mirror of https://gitlab.com/news-flash/article_scraper.git synced 2025-07-07 16:15:32 +02:00

add clean links test

Jan Lukas Gernert 2023-03-09 21:24:29 +01:00
parent c5c6b788c8
commit 3ece2522bb
5 changed files with 3258 additions and 178 deletions

View file

@@ -1,170 +0,0 @@
<article><DIV id="readability-page-1"><article itemscope="itemscope" itemtype="https://schema.org/NewsArticle"><meta itemprop="datePublished" content="2019-04-30T13:39:00-04:00">
<meta itemprop="dateModified" content="2019-04-30T13:40:00-04:00">
<meta itemprop="mainEntityOfPage" content="https://www.citylab.com/design/2019/04/neon-signage-20th-century-history/588400/">
<figure itemprop="image" itemscope="itemscope" itemtype="http://schema.org/ImageObject"><picture><source srcset="https://cdn.citylab.com/media/img/citylab/2019/04/mr1/940.jpg?mod=1556645448" media="(min-width: 1024px)"></source><source srcset="https://cdn.citylab.com/media/img/citylab/2019/04/mr1/lead_large.jpg?mod=1556645448" media="(min-width: 576px)"></source></picture><meta itemprop="height" content="128">
<meta itemprop="width" content="300">
<meta itemprop="url" content="https://cdn.citylab.com/media/img/citylab/2019/04/mr1/300.jpg?mod=1556645448">
<picture><source srcset="https://cdn.citylab.com/media/img/citylab/2019/04/mr1/300.jpg?mod=1556645448" media="(max-width: 575px)"></source><img src="https://cdn.citylab.com/media/img/citylab/2019/04/mr1/300.jpg?mod=1556645448" alt=""></picture><figcaption><span itemprop="caption">The Moulin Rouge cabaret in
Paris</span><span itemprop="creator">Benoit
Tessier/Reuters</span></figcaption></figure><div>
<h2 itemprop="headline">
Why Neon Is the Ultimate Symbol of the 20th Century
</h2>
<div><p><span><time>1:39 PM
ET</time></span></p></div>
</div>
<h2 itemprop="description">
The once-ubiquitous form of lighting was novel when it first emerged in the early 1900s,
though it has since come to represent decline.
</h2>
<section id="article-section-1"><p>
In the summer of 1898, the Scottish chemist Sir William Ramsay made a discovery that
would eventually give the Moulin Rouge in Paris, the Las Vegas Strip, and New York’s
Times Square their perpetual nighttime glow. Using the boiling point of argon as a
reference point, Ramsay and his colleague Morris W. Travers isolated three more noble
gases and gave them evocative Greek names: neon, krypton, and xenon. In so doing, the
scientists bestowed a label of permanent novelty on the most famous of the trio—neon,
which translates as “new.” This discovery was the foundation on which the French
engineer Georges Claude crafted a new form of illumination over the next decade. He
designed glass tubes in which neon gas could be trapped, then electrified, to create a
light that glowed reliably for more than 1,000 hours.
</p>
<p>
In the 2012 book <em>L’être et le Néon</em>, <a href="https://mitpress.mit.edu/books/being-and-neonness-translation-and-content-revised-augmented-and-updated-edition-luis-de-miranda" target="_blank">which
has been newly translated into English by Michael Wells</a>, the philosopher Luis de
Miranda weaves a history of neon lighting as both artifact and metaphor. <em>Being and
Neonness</em>, as the book is called in its English edition, isn’t a typical
material history. There are no photographs. Even de Miranda’s own example of a neon deli
sign spotted in Paris is re-created typographically, with text in all caps and dashes
forming the border of the sign, as one might attempt on Twitter. Fans of Miami Beach’s
restored Art Deco hotels and California’s bowling alleys might be disappointed by the
lack of glossy historical images. Nonetheless, de Miranda makes a convincing case for
neon as a symbol of the grand modern ambitions of the 20th century.
</p>
<p>
De Miranda beautifully evokes the notion of neon lighting as an icon of the 1900s in his
introduction: “When we hear the word <em>neon</em>, an image pops into our heads: a
combination of light, colors, symbols, and glass. This image is itself a mood. It
carries an atmosphere. It speaks … of the essence of cities, of the poetry of nights, of
the 20th century.” When neon lights debuted in Europe, they seemed dazzlingly
futuristic. But their husky physicality started becoming obsolete by the 1960s, thanks
in part to the widespread use of plastic for fluorescent signs. Neon signs exist today,
though they’ve been eclipsed by newer technologies such as digital billboards, and they
remain charmingly analog: Signs must be made by hand because there’s no cost-effective
way to mass-produce them.
</p>
<p>
In the 1910s, neon started being used for cosmopolitan flash in Paris at precisely the
time and place where the first great modernist works were being created. De Miranda’s
recounting of the ingenuity emerging from the French capital a century ago is thrilling
to contemplate: the cubist art of Pablo Picasso, the radically deconstructed fashions of
Coco Chanel, the stream-of-consciousness poetry of Gertrude Stein, and the genre-defying
music of Claude Debussy—all of which heralded a new age of culture for Europe and for
the world.
</p></section><section id="article-section-2"><p>
Amid this artistic groundswell, Georges Claude premiered his neon lights at the <a href="https://www.mondial-paris.com/en/visiteur/auto" target="_blank">Paris Motor Show</a> in
December 1910, captivating visitors with 40-foot-tall tubes affixed to the building’s
exterior. The lights shone orange-red because neon, by itself, produces that color.
<em>Neon lighting</em> is a catchall term that describes the technology of glass tubing
that contains gas or chemicals that glow when electrified. For example, neon fabricators
use carbon dioxide to make white, and mercury to make blue. Claude acknowledged at the
time that neon didn’t produce the ideal color for a standard light bulb and insisted
that it posed no commercial threat to incandescent bulbs.
</p>
<p>
Of course, the very quality that made neon fixtures a poor choice for interior lighting
made them perfect for signs, de Miranda notes. The first of the neon signs was switched
on in 1912, advertising a barbershop on Paris’s Boulevard Montmartre, and eventually
they were adopted by cinemas and nightclubs. While Claude had a monopoly on neon
lighting throughout the 1920s, the leaking of trade secrets and the expiration of a
series of patents broke his hold on the rapidly expanding technology.
</p></section><section id="article-section-3"><p>
In the following decades, neon’s nonstop glow and vibrant colors turned ordinary
buildings and surfaces into 24/7 billboards for businesses, large and small, that wanted
to convey a sense of always being open. The first examples of neon in the United States
debuted in Los Angeles, where the Packard Motor Car Company commissioned two large
blue-and-orange <span>Packard</span> signs that literally stopped
traffic because they distracted motorists. The lighting also featured heavily at the
Chicago Century of Progress Exposition in 1933 and at the 1939 World’s Fair in New York.
At the latter event, a massive neon sign reading <span>Futurama</span>
lit the way to a General Motors exhibition that heralded “The World of Tomorrow.”
</p>
<figure><picture><img alt="" data-srcset="https://cdn.theatlantic.com/assets/media/img/posts/2019/04/AP_8912060228/cbd32b0e1.jpg"></picture><figcaption>
Workers remove a hammer and sickle from a neon sign that reads “Glory to Communism,”
visible on the roof of the Communist-run electricity-board headquarters in
Czechoslovakia in 1989. (AP)
</figcaption></figure><p>
De Miranda points out that businesses weren’t alone in embracing neon’s ability to
spread messages effectively. By the middle of the century, the lighting was being
adopted for more political purposes. “In the 1960s, the Soviets deployed a vast
neonization of the Eastern bloc capitals to emulate capitalist metropolises,” de
Miranda writes. “Because consumer shops were rare in the Polish capital [of Warsaw],
they did not hesitate to illuminate the façades of public buildings.” In other words, as
opposed to the sole use of the more obvious forms of propaganda via posters or slogans,
the mass introduction of neon lighting was a way of getting citizens of Communist cities
to see their surroundings with the pizzazz and nighttime glamour of major Western
capitals.
</p></section><section id="article-section-4"><p>
Neon, around this time, began to be phased out, thanks to cheaper and less
labor-intensive alternatives. In addition, the global economic downturn of the 1970s
yielded a landscape in which older, flickering neon signs, which perhaps their owners
couldn’t afford to fix or replace, came to look like symbols of decline. Where such
signs were once sophisticated and novel, they now seemed dated and even seedy.
</p>
<section><h2>
Cities are changing fast. Keep up with the <b>CityLab Daily</b> newsletter.
</h2>
<label for="promo-email-input-email">The best way to follow issues you
care about.</label></section><p>
De Miranda understands this evolution by zooming out and looking at the 1900s as the
“neon century.” The author draws a parallel between the physical form of neon lights,
which again are essentially containers for electrified gases, and that of a glass
capsule—suggesting they are a kind of message in a bottle from a time before the First
World War. “Since then, [neon lights] have witnessed all the transformations that have
created the world we live in,” de Miranda writes. “Today, they sometimes seem to
maintain a hybrid status, somewhere between junkyards and museums, not unlike European
capitals themselves.”
</p>
<figure><picture><img alt="" data-srcset="https://cdn.theatlantic.com/assets/media/img/posts/2019/04/AP_945361213236/888fdd750.jpg"></picture><figcaption>
Martin Wartman, a student at Northern Kentucky University, works on a neon sign at
the Neonworks of Cincinnati workshop connected to the American Sign Museum, in 2016.
(John Minchillo / AP)
</figcaption></figure><p>
Another mark of neon’s hybridity: Its obsolescence started just as some contemporary
artists began using the lights in their sculptures. Bruce Nauman’s 1968 work <em><a href="https://www.stedelijk.nl/en/collection/1097-bruce-nauman-my-name-as-though-it-were-written-on-the-surface-of-the-moon" target="_blank">My
Name as Though It Were Written on the Surface of the Moon</a></em> poked fun at
the space race—another symbol of 20th-century technological innovation whose moment has
passed. The piece uses blue “neon” letters (mercury, actually) to spell out the name
“bruce” in lowercase cursive, with each character repeated several times as if to convey
a person speaking slowly in outer space. The British artist Tracey Emin has made <a href="https://www.artsy.net/collection/tracey-emin-neon-sculptures-and-prints" target="_blank">sculptures</a>
that resemble neon Valentine’s Day candies: They read as garish and sentimental
confections with pink, heart-shaped frames that surround blue text fragments. Drawing on
the nostalgia-inducing quality of neon, the sculptures’ messages are redolent of
old-fashioned movie dialogue, with titles such as “You Loved Me Like a Distant Star” and
“The Kiss Was Beautiful.”
</p>
<p>
Seeing neon lighting tamed in the context of a gallery display fits comfortably with de
Miranda’s notion that neon technology is like a time capsule from another age. In
museums, works of neon art and design coexist with objects that were ahead of their own
time in years past—a poignant fate for a technology that made its name advertising “The
World of Tomorrow.” Yet today neon is also experiencing a kind of craft revival. The
fact that it can’t be mass-produced has made its fabrication something akin to a
cherished artisanal technique. Bars and restaurants hire firms such as Let There Be Neon
in Manhattan, or <a href="https://www.instagram.com/theneonqueen/" target="_blank">the L.A.-based master
neon artist Lisa Schulte</a>, to create custom signs and works of art. Neon’s story
even continues to glow from inside museums such as California’s <a href="https://www.neonmona.org/" target="_blank">Museum of Neon Art</a> and the Neon Museum in Las
Vegas. If it can still be a vital medium for artists and designers working today,
“neonness” need not only be trapped in the past. It might also capture the mysterious
glow of the near future—just as it did a century ago.
</p>
<p><em>This article originally appeared on <a href="https://www.theatlantic.com/entertainment/archive/2019/04/being-and-neonness-neon-lights-symbol-20th-century/588184/" target="_blank">The
Atlantic</a>.</em></p></section><section data-include="css:https://cdn.citylab.com/static/a/frontend/dist/citylab/css/components/author-article.cf4e8e0b143f.css"><h4>
About the Author
</h4>
<div itemprop="author">
<h5 itemprop="name"><a href="https://www.citylab.com/authors/sarah-archer/" target="_blank">Sarah Archer</a></h5>
<p itemprop="description"><a href="https://www.citylab.com/authors/sarah-archer/" data-omni-click="inherit" target="_blank">Sarah Archer</a> is the author of <em>The
Midcentury Kitchen</em>.
</p>
</div></section></article></DIV></article>

File diff suppressed because it is too large

File diff suppressed because it is too large

View file

@@ -502,16 +502,20 @@ impl FullTextParser {
         let node_vec = Util::evaluate_xpath(context, xpath, false)?;
         for mut node in node_vec {
             if let Some(url) = node.get_attribute(attribute) {
+                let trimmed_url = url.trim();
                 let is_relative_url = url::Url::parse(&url)
                     .err()
                     .map(|err| err == url::ParseError::RelativeUrlWithoutBase)
                     .unwrap_or(false);
-                if is_relative_url {
-                    let completed_url = article_url.join(&url)?;
-                    node.set_attribute(attribute, completed_url.as_str())
-                        .map_err(|_| FullTextParserError::Scrape)?;
-                }
+                let completed_url = if is_relative_url {
+                    article_url.join(trimmed_url)?
+                } else {
+                    Url::parse(trimmed_url)?
+                };
+                node.set_attribute(attribute, completed_url.as_str())
+                    .map_err(|_| FullTextParserError::Scrape)?;
             }
         }
         Ok(())
@@ -867,7 +871,7 @@ impl FullTextParser {
             Util::clean_conditionally(&mut root, "ul");
             Util::clean_conditionally(&mut root, "div");
-            Self::clean_classes(&mut root)?;
+            Self::clean_attributes(&mut root)?;
             Self::simplify_nested_elements(&mut root)?;
         }
@@ -895,7 +899,7 @@ impl FullTextParser {
         }
     }
-    fn clean_classes(root: &mut Node) -> Result<(), FullTextParserError> {
+    fn clean_attributes(root: &mut Node) -> Result<(), FullTextParserError> {
         let mut node_iter = Some(root.clone());
         while let Some(mut node) = node_iter {
@@ -904,6 +908,11 @@ impl FullTextParser {
                 FullTextParserError::Xml
             })?;
+            node.remove_attribute("align").map_err(|e| {
+                log::error!("{e}");
+                FullTextParserError::Xml
+            })?;
             node.remove_attribute(constants::SCORE_ATTR).map_err(|e| {
                 log::error!("{e}");
                 FullTextParserError::Xml
@@ -915,6 +924,10 @@ impl FullTextParser {
                 FullTextParserError::Xml
             })?;
+            if node.get_name().to_uppercase() == "FONT" {
+                node.set_name("span").unwrap();
+            }
             node_iter = Util::next_node(&node, false);
         }
         Ok(())
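
For readers skimming the `clean_attributes` changes, here is a minimal, standalone sketch of the same idea: walk a tree, drop presentational attributes, and rename `font` elements to `span`. The `Element` struct, the `STRIPPED` list, and the `data-score` name are illustrative assumptions, not the types or constants used by this crate, which works on libxml nodes and `constants::SCORE_ATTR`. Treating `class` as one of the removed attributes is also a guess based on the old function name.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for an XML node; NOT the node type used in the diff.
#[derive(Debug, Default)]
struct Element {
    name: String,
    attributes: BTreeMap<String, String>,
    children: Vec<Element>,
}

/// Attributes dropped by this sketch. "align" comes from the diff;
/// "class" and "data-score" are assumed placeholders.
const STRIPPED: [&str; 3] = ["class", "align", "data-score"];

/// Recursively strip the listed attributes and normalise <font> to <span>.
fn clean_attributes(element: &mut Element) {
    for attr in STRIPPED {
        element.attributes.remove(attr);
    }
    // Case-insensitive comparison, as in the diff: "font" and "FONT" both match.
    if element.name.to_uppercase() == "FONT" {
        element.name = "span".to_string();
    }
    for child in &mut element.children {
        clean_attributes(child);
    }
}
```

Unlike this sketch, the function in the diff iterates with `Util::next_node` rather than recursing, and it maps attribute-removal failures to `FullTextParserError::Xml`.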

View file

@@ -126,6 +126,11 @@ async fn citylab_1() {
     run_test("citylab-1").await
 }
+#[tokio::test]
+async fn clean_links() {
+    run_test("clean-links").await
+}
 #[tokio::test]
 async fn webmd_1() {
     run_test("webmd-1").await