1
0
Fork 0
mirror of https://gitlab.com/news-flash/article_scraper.git synced 2025-07-07 16:15:32 +02:00
This commit is contained in:
Jan Lukas Gernert 2023-02-26 02:22:53 +01:00
parent d8e3a75b01
commit 0834c4d72a
8 changed files with 3234 additions and 489 deletions

View file

@ -1,9 +1,7 @@
<article><DIV id="readability-page-1" class="page"><section>
<p><strong>So finally you're <a href="/code/2013/testing-frontend-javascript-code-using-mocha-chai-and-sinon/" target="_blank">testing your frontend JavaScript code</a>? Great! The more you
<article><DIV id="readability-page-1" class="page"><section><p><strong>So finally you're <a href="/code/2013/testing-frontend-javascript-code-using-mocha-chai-and-sinon/" target="_blank">testing your frontend JavaScript code</a>? Great! The more you
write tests, the more confident you are with your code… but how much precisely?
That's where <a href="http://en.wikipedia.org/wiki/Code_coverage" target="_blank">code coverage</a> might
help.</strong>
</p>
help.</strong></p>
<p>The idea behind code coverage is to record which parts of your code (functions,
statements, conditionals and so on) have been executed by your test suite,
to compute metrics out of these data and usually to provide tools for navigating
@ -23,17 +21,14 @@ help.</strong>
</blockquote>
<p><strong><a href="http://blanketjs.org/" target="_blank">Blanket.js</a></strong> is an <em>easy to install, easy to configure,
and easy to use JavaScript code coverage library that works both in-browser and
with nodejs.</em>
</p>
with nodejs.</em></p>
<p>Its use is dead easy, adding Blanket support to your Mocha test suite
is just matter of adding this simple line to your HTML test file:</p>
<pre><code>&lt;script src="vendor/blanket.js"
data-cover-adapter="vendor/mocha-blanket.js"&gt;&lt;/script&gt;
</code></pre>
<p>Source files: <a href="https://raw.github.com/alex-seville/blanket/master/dist/qunit/blanket.min.js" target="_blank">blanket.js</a>,
<a href="https://raw.github.com/alex-seville/blanket/master/src/adapters/mocha-blanket.js" target="_blank">mocha-blanket.js</a>
</p>
<a href="https://raw.github.com/alex-seville/blanket/master/src/adapters/mocha-blanket.js" target="_blank">mocha-blanket.js</a></p>
<p>As an example, let's reuse the silly <code>Cow</code> example we used
<a href="/code/2013/testing-frontend-javascript-code-using-mocha-chai-and-sinon/" target="_blank">in a previous episode</a>:</p>
<pre><code>// cow.js
@ -54,7 +49,6 @@ with nodejs.</em>
};
})(this);
</code></pre>
<p>And its test suite, powered by Mocha and <a href="http://chaijs.com/" target="_blank">Chai</a>:</p>
<pre><code>var expect = chai.expect;
@ -79,7 +73,6 @@ describe("Cow", function() {
});
});
</code></pre>
<p>Let's create the HTML test file for it, featuring Blanket and its adapter
for Mocha:</p>
<pre><code>&lt;!DOCTYPE html&gt;
@ -104,7 +97,6 @@ describe("Cow", function() {
&lt;/body&gt;
&lt;/html&gt;
</code></pre>
<p><strong>Notes</strong>:</p>
<ul>
<li>Notice the <code>data-cover</code> attribute we added to the script tag
@ -113,9 +105,7 @@ describe("Cow", function() {
be loaded.</li>
</ul>
<p>Running the tests now gives us something like this:</p>
<p>
<img alt="screenshot" src="/static/code/2013/blanket-coverage.png">
</p>
<p><img alt="screenshot" src="/static/code/2013/blanket-coverage.png"></p>
<p>As you can see, the report at the bottom highlights that we haven't actually
tested the case where an error is raised in case a target name is missing.
We've been informed of that, nothing more, nothing less. We simply know
@ -127,6 +117,4 @@ describe("Cow", function() {
sessions
and <a href="http://alexgaynor.net/2013/sep/26/effective-code-review/" target="_blank">code reviews</a>
but that's another story.</p>
<p><strong>So is code coverage silver bullet? No. Is it useful? Definitely. Happy testing!</strong>
</p>
</section></DIV></article>
<p><strong>So is code coverage silver bullet? No. Is it useful? Definitely. Happy testing!</strong></p></section></DIV></article>

View file

@ -1,6 +1,4 @@
<article><DIV id="readability-page-1" class="page">
<article role="article">
<p>For more than a decade the Web has used XMLHttpRequest (XHR) to achieve
<article><DIV id="readability-page-1" class="page"><article role="article"><p>For more than a decade the Web has used XMLHttpRequest (XHR) to achieve
asynchronous requests in JavaScript. While very useful, XHR is not a very
nice API. It suffers from lack of separation of concerns. The input, output
and state are all managed by interacting with one object, and state is
@ -22,122 +20,49 @@
</ol>
<p>As of this writing, the Fetch API is available in Firefox 39 (currently
Nightly) and Chrome 42 (currently dev). Github has a <a href="https://github.com/github/fetch" target="_blank">Fetch polyfill</a>.</p>
<h2>Feature detection</h2>
<p>Fetch API support can be detected by checking for <code>Headers</code>,<code>Request</code>, <code>Response</code> or <code>fetch</code> on
the <code>window</code> or <code>worker</code> scope.</p>
<h2>Simple fetching</h2>
<p>The most useful, high-level part of the Fetch API is the <code>fetch()</code> function.
In its simplest form it takes a URL and returns a promise that resolves
to the response. The response is captured as a <code>Response</code> object.</p>
<P>
<table>
<tbody>
<tr>
<td>
<pre>fetch<span>(</span><span>"/data.json"</span><span>)</span>.<span>then</span><span>(</span><span>function</span><span>(</span>res<span>)</span> <span>{</span>
<span>// res instanceof Response == true.</span>
<span>if</span> <span>(</span>res.<span>ok</span><span>)</span> <span>{</span>
<div><table><tbody><tr><td><pre>fetch<span>(</span><span>"/data.json"</span><span>)</span>.<span>then</span><span>(</span><span>function</span><span>(</span>res<span>)</span><span>{</span><span>// res instanceof Response == true.</span><span>if</span><span>(</span>res.<span>ok</span><span>)</span><span>{</span>
res.<span>json</span><span>(</span><span>)</span>.<span>then</span><span>(</span><span>function</span><span>(</span>data<span>)</span><span>{</span>
console.<span>log</span><span>(</span>data.<span>entries</span><span>)</span><span>;</span>
<span>}</span><span>)</span><span>;</span>
<span>}</span> <span>else</span> <span>{</span>
console.<span>log</span><span>(</span><span>"Looks like the response wasn't perfect, got status"</span><span>,</span> res.<span>status</span><span>)</span><span>;</span>
<span>}</span>
<span>}</span><span>,</span> <span>function</span><span>(</span>e<span>)</span> <span>{</span>
console.<span>log</span><span>(</span><span>"Fetch failed!"</span><span>,</span> e<span>)</span><span>;</span>
<span>}</span><span>)</span><span>;</span></pre>
</td>
</tr>
</tbody>
</table>
</P>
console.<span>log</span><span>(</span>data.<span>entries</span><span>)</span><span>;</span><span>}</span><span>)</span><span>;</span><span>}</span><span>else</span><span>{</span>
console.<span>log</span><span>(</span><span>"Looks like the response wasn't perfect, got status"</span><span>,</span> res.<span>status</span><span>)</span><span>;</span><span>}</span><span>}</span><span>,</span><span>function</span><span>(</span>e<span>)</span><span>{</span>
console.<span>log</span><span>(</span><span>"Fetch failed!"</span><span>,</span> e<span>)</span><span>;</span><span>}</span><span>)</span><span>;</span></pre></td></tr></tbody></table></div>
<p>Submitting some parameters, it would look like this:</p>
<P>
<table>
<tbody>
<tr>
<td>
<pre>fetch<span>(</span><span>"http://www.example.org/submit.php"</span><span>,</span> <span>{</span>
<div><table><tbody><tr><td><pre>fetch<span>(</span><span>"http://www.example.org/submit.php"</span><span>,</span><span>{</span>
method<span>:</span><span>"POST"</span><span>,</span>
headers<span>:</span> <span>{</span>
<span>"Content-Type"</span><span>:</span> <span>"application/x-www-form-urlencoded"</span>
<span>}</span><span>,</span>
body<span>:</span> <span>"firstName=Nikhil&amp;favColor=blue&amp;password=easytoguess"</span>
<span>}</span><span>)</span>.<span>then</span><span>(</span><span>function</span><span>(</span>res<span>)</span> <span>{</span>
<span>if</span> <span>(</span>res.<span>ok</span><span>)</span> <span>{</span>
alert<span>(</span><span>"Perfect! Your settings are saved."</span><span>)</span><span>;</span>
<span>}</span> <span>else</span> <span>if</span> <span>(</span>res.<span>status</span> <span>==</span> <span>401</span><span>)</span> <span>{</span>
alert<span>(</span><span>"Oops! You are not authorized."</span><span>)</span><span>;</span>
<span>}</span>
<span>}</span><span>,</span> <span>function</span><span>(</span>e<span>)</span> <span>{</span>
alert<span>(</span><span>"Error submitting form!"</span><span>)</span><span>;</span>
<span>}</span><span>)</span><span>;</span></pre>
</td>
</tr>
</tbody>
</table>
</P>
headers<span>:</span><span>{</span><span>"Content-Type"</span><span>:</span><span>"application/x-www-form-urlencoded"</span><span>}</span><span>,</span>
body<span>:</span><span>"firstName=Nikhil&amp;favColor=blue&amp;password=easytoguess"</span><span>}</span><span>)</span>.<span>then</span><span>(</span><span>function</span><span>(</span>res<span>)</span><span>{</span><span>if</span><span>(</span>res.<span>ok</span><span>)</span><span>{</span>
alert<span>(</span><span>"Perfect! Your settings are saved."</span><span>)</span><span>;</span><span>}</span><span>else</span><span>if</span><span>(</span>res.<span>status</span><span>==</span><span>401</span><span>)</span><span>{</span>
alert<span>(</span><span>"Oops! You are not authorized."</span><span>)</span><span>;</span><span>}</span><span>}</span><span>,</span><span>function</span><span>(</span>e<span>)</span><span>{</span>
alert<span>(</span><span>"Error submitting form!"</span><span>)</span><span>;</span><span>}</span><span>)</span><span>;</span></pre></td></tr></tbody></table></div>
<p>The <code>fetch()</code> functions arguments are the same as those passed
to the
<br>
<code>Request()</code> constructor, so you may directly pass arbitrarily
<br><code>Request()</code> constructor, so you may directly pass arbitrarily
complex requests to <code>fetch()</code> as discussed below.</p>
<h2>Headers</h2>
<p>Fetch introduces 3 interfaces. These are <code>Headers</code>, <code>Request</code> and
<br>
<code>Response</code>. They map directly to the underlying HTTP concepts,
<br><code>Response</code>. They map directly to the underlying HTTP concepts,
but have
<br>certain visibility filters in place for privacy and security reasons,
such as
<br>supporting CORS rules and ensuring cookies arent readable by third parties.</p>
<p>The <a href="https://fetch.spec.whatwg.org/#headers-class" target="_blank">Headers interface</a> is
a simple multi-map of names to values:</p>
<P>
<table>
<tbody>
<tr>
<td>
<pre><span>var</span> content <span>=</span> <span>"Hello World"</span><span>;</span>
<span>var</span> reqHeaders <span>=</span> <span>new</span> Headers<span>(</span><span>)</span><span>;</span>
<div><table><tbody><tr><td><pre><span>var</span> content <span>=</span><span>"Hello World"</span><span>;</span><span>var</span> reqHeaders <span>=</span><span>new</span> Headers<span>(</span><span>)</span><span>;</span>
reqHeaders.<span>append</span><span>(</span><span>"Content-Type"</span><span>,</span><span>"text/plain"</span>
reqHeaders.<span>append</span><span>(</span><span>"Content-Length"</span><span>,</span> content.<span>length</span>.<span>toString</span><span>(</span><span>)</span><span>)</span><span>;</span>
reqHeaders.<span>append</span><span>(</span><span>"X-Custom-Header"</span><span>,</span> <span>"ProcessThisImmediately"</span><span>)</span><span>;</span></pre>
</td>
</tr>
</tbody>
</table>
</P>
reqHeaders.<span>append</span><span>(</span><span>"X-Custom-Header"</span><span>,</span><span>"ProcessThisImmediately"</span><span>)</span><span>;</span></pre></td></tr></tbody></table></div>
<p>The same can be achieved by passing an array of arrays or a JS object
literal
<br>to the constructor:</p>
<P>
<table>
<tbody>
<tr>
<td>
<pre>reqHeaders <span>=</span> <span>new</span> Headers<span>(</span><span>{</span>
<span>"Content-Type"</span><span>:</span> <span>"text/plain"</span><span>,</span>
<span>"Content-Length"</span><span>:</span> content.<span>length</span>.<span>toString</span><span>(</span><span>)</span><span>,</span>
<span>"X-Custom-Header"</span><span>:</span> <span>"ProcessThisImmediately"</span><span>,</span>
<span>}</span><span>)</span><span>;</span></pre>
</td>
</tr>
</tbody>
</table>
</P>
<div><table><tbody><tr><td><pre>reqHeaders <span>=</span><span>new</span> Headers<span>(</span><span>{</span><span>"Content-Type"</span><span>:</span><span>"text/plain"</span><span>,</span><span>"Content-Length"</span><span>:</span> content.<span>length</span>.<span>toString</span><span>(</span><span>)</span><span>,</span><span>"X-Custom-Header"</span><span>:</span><span>"ProcessThisImmediately"</span><span>,</span><span>}</span><span>)</span><span>;</span></pre></td></tr></tbody></table></div>
<p>The contents can be queried and retrieved:</p>
<P>
<table>
<tbody>
<tr>
<td>
<pre>console.<span>log</span><span>(</span>reqHeaders.<span>has</span><span>(</span><span>"Content-Type"</span><span>)</span><span>)</span><span>;</span> <span>// true</span>
<div><table><tbody><tr><td><pre>console.<span>log</span><span>(</span>reqHeaders.<span>has</span><span>(</span><span>"Content-Type"</span><span>)</span><span>)</span><span>;</span><span>// true</span>
console.<span>log</span><span>(</span>reqHeaders.<span>has</span><span>(</span><span>"Set-Cookie"</span><span>)</span><span>)</span><span>;</span><span>// false</span>
reqHeaders.<span>set</span><span>(</span><span>"Content-Type"</span><span>,</span><span>"text/html"</span><span>)</span><span>;</span>
reqHeaders.<span>append</span><span>(</span><span>"X-Custom-Header"</span><span>,</span><span>"AnotherValue"</span><span>)</span><span>;</span>
@ -146,12 +71,7 @@ console.<span>log</span><span>(</span>reqHeaders.<span>get</span><span>(</span><
console.<span>log</span><span>(</span>reqHeaders.<span>getAll</span><span>(</span><span>"X-Custom-Header"</span><span>)</span><span>)</span><span>;</span><span>// ["ProcessThisImmediately", "AnotherValue"]</span>
 
reqHeaders.<span>delete</span><span>(</span><span>"X-Custom-Header"</span><span>)</span><span>;</span>
console.<span>log</span><span>(</span>reqHeaders.<span>getAll</span><span>(</span><span>"X-Custom-Header"</span><span>)</span><span>)</span><span>;</span> <span>// []</span></pre>
</td>
</tr>
</tbody>
</table>
</P>
console.<span>log</span><span>(</span>reqHeaders.<span>getAll</span><span>(</span><span>"X-Custom-Header"</span><span>)</span><span>)</span><span>;</span><span>// []</span></pre></td></tr></tbody></table></div>
<p>Some of these operations are only useful in ServiceWorkers, but they provide
<br>a much nicer API to Headers.</p>
<p>Since Headers can be sent in requests, or received in responses, and have
@ -178,82 +98,34 @@ console.<span>log</span><span>(</span>reqHeaders.<span>getAll</span><span>(</spa
<p>All of the Headers methods throw TypeError if <code>name</code> is not a
<a href="https://fetch.spec.whatwg.org/#concept-header-name" target="_blank">valid HTTP Header name</a>. The mutation operations will throw TypeError
if there is an immutable guard. Otherwise they fail silently. For example:</p>
<P>
<table>
<tbody>
<tr>
<td>
<pre><span>var</span> res <span>=</span> Response.<span>error</span><span>(</span><span>)</span><span>;</span>
<span>try</span> <span>{</span>
res.<span>headers</span>.<span>set</span><span>(</span><span>"Origin"</span><span>,</span> <span>"http://mybank.com"</span><span>)</span><span>;</span>
<span>}</span> <span>catch</span><span>(</span>e<span>)</span> <span>{</span>
console.<span>log</span><span>(</span><span>"Cannot pretend to be a bank!"</span><span>)</span><span>;</span>
<span>}</span></pre>
</td>
</tr>
</tbody>
</table>
</P>
<div><table><tbody><tr><td><pre><span>var</span> res <span>=</span> Response.<span>error</span><span>(</span><span>)</span><span>;</span><span>try</span><span>{</span>
res.<span>headers</span>.<span>set</span><span>(</span><span>"Origin"</span><span>,</span><span>"http://mybank.com"</span><span>)</span><span>;</span><span>}</span><span>catch</span><span>(</span>e<span>)</span><span>{</span>
console.<span>log</span><span>(</span><span>"Cannot pretend to be a bank!"</span><span>)</span><span>;</span><span>}</span></pre></td></tr></tbody></table></div>
<h2>Request</h2>
<p>The Request interface defines a request to fetch a resource over HTTP.
URL, method and headers are expected, but the Request also allows specifying
a body, a request mode, credentials and cache hints.</p>
<p>The simplest Request is of course, just a URL, as you may do to GET a
resource.</p>
<P>
<table>
<tbody>
<tr>
<td>
<pre><span>var</span> req <span>=</span> <span>new</span> Request<span>(</span><span>"/index.html"</span><span>)</span><span>;</span>
<div><table><tbody><tr><td><pre><span>var</span> req <span>=</span><span>new</span> Request<span>(</span><span>"/index.html"</span><span>)</span><span>;</span>
console.<span>log</span><span>(</span>req.<span>method</span><span>)</span><span>;</span><span>// "GET"</span>
console.<span>log</span><span>(</span>req.<span>url</span><span>)</span><span>;</span> <span>// "http://example.com/index.html"</span></pre>
</td>
</tr>
</tbody>
</table>
</P>
console.<span>log</span><span>(</span>req.<span>url</span><span>)</span><span>;</span><span>// "http://example.com/index.html"</span></pre></td></tr></tbody></table></div>
<p>You may also pass a Request to the <code>Request()</code> constructor to
create a copy.
<br>(This is not the same as calling the <code>clone()</code> method, which
is covered in
<br>the “Reading bodies” section.).</p>
<P>
<table>
<tbody>
<tr>
<td>
<pre><span>var</span> copy <span>=</span> <span>new</span> Request<span>(</span>req<span>)</span><span>;</span>
<div><table><tbody><tr><td><pre><span>var</span> copy <span>=</span><span>new</span> Request<span>(</span>req<span>)</span><span>;</span>
console.<span>log</span><span>(</span>copy.<span>method</span><span>)</span><span>;</span><span>// "GET"</span>
console.<span>log</span><span>(</span>copy.<span>url</span><span>)</span><span>;</span> <span>// "http://example.com/index.html"</span></pre>
</td>
</tr>
</tbody>
</table>
</P>
console.<span>log</span><span>(</span>copy.<span>url</span><span>)</span><span>;</span><span>// "http://example.com/index.html"</span></pre></td></tr></tbody></table></div>
<p>Again, this form is probably only useful in ServiceWorkers.</p>
<p>The non-URL attributes of the <code>Request</code> can only be set by passing
initial
<br>values as a second argument to the constructor. This argument is a dictionary.</p>
<P>
<table>
<tbody>
<tr>
<td>
<pre><span>var</span> uploadReq <span>=</span> <span>new</span> Request<span>(</span><span>"/uploadImage"</span><span>,</span> <span>{</span>
<div><table><tbody><tr><td><pre><span>var</span> uploadReq <span>=</span><span>new</span> Request<span>(</span><span>"/uploadImage"</span><span>,</span><span>{</span>
method<span>:</span><span>"POST"</span><span>,</span>
headers<span>:</span> <span>{</span>
<span>"Content-Type"</span><span>:</span> <span>"image/png"</span><span>,</span>
<span>}</span><span>,</span>
body<span>:</span> <span>"image data"</span>
<span>}</span><span>)</span><span>;</span></pre>
</td>
</tr>
</tbody>
</table>
</P>
headers<span>:</span><span>{</span><span>"Content-Type"</span><span>:</span><span>"image/png"</span><span>,</span><span>}</span><span>,</span>
body<span>:</span><span>"image data"</span><span>}</span><span>)</span><span>;</span></pre></td></tr></tbody></table></div>
<p>The Requests mode is used to determine if cross-origin requests lead
to valid responses, and which properties on the response are readable.
Legal mode values are <code>"same-origin"</code>, <code>"no-cors"</code> (default)
@ -262,22 +134,10 @@ console.<span>log</span><span>(</span>copy.<span>url</span><span>)</span><span>;
origin with this mode set, the result is simply an error. You could use
this to ensure that
<br>a request is always being made to your origin.</p>
<P>
<table>
<tbody>
<tr>
<td>
<pre><span>var</span> arbitraryUrl <span>=</span> document.<span>getElementById</span><span>(</span><span>"url-input"</span><span>)</span>.<span>value</span><span>;</span>
<div><table><tbody><tr><td><pre><span>var</span> arbitraryUrl <span>=</span> document.<span>getElementById</span><span>(</span><span>"url-input"</span><span>)</span>.<span>value</span><span>;</span>
fetch<span>(</span>arbitraryUrl<span>,</span><span>{</span> mode<span>:</span><span>"same-origin"</span><span>}</span><span>)</span>.<span>then</span><span>(</span><span>function</span><span>(</span>res<span>)</span><span>{</span>
console.<span>log</span><span>(</span><span>"Response succeeded?"</span><span>,</span> res.<span>ok</span><span>)</span><span>;</span>
<span>}</span><span>,</span> <span>function</span><span>(</span>e<span>)</span> <span>{</span>
console.<span>log</span><span>(</span><span>"Please enter a same-origin URL!"</span><span>)</span><span>;</span>
<span>}</span><span>)</span><span>;</span></pre>
</td>
</tr>
</tbody>
</table>
</P>
console.<span>log</span><span>(</span><span>"Response succeeded?"</span><span>,</span> res.<span>ok</span><span>)</span><span>;</span><span>}</span><span>,</span><span>function</span><span>(</span>e<span>)</span><span>{</span>
console.<span>log</span><span>(</span><span>"Please enter a same-origin URL!"</span><span>)</span><span>;</span><span>}</span><span>)</span><span>;</span></pre></td></tr></tbody></table></div>
<p>The <code>"no-cors"</code> mode captures what the web platform does by default
for scripts you import from CDNs, images hosted on other domains, and so
on. First, it prevents the method from being anything other than “HEAD”,
@ -295,53 +155,22 @@ fetch<span>(</span>arbitraryUrl<span>,</span> <span>{</span> mode<span>:</span>
headers is exposed in the Response, but the body is readable. For example,
you could get a list of Flickrs <a href="https://www.flickr.com/services/api/flickr.interestingness.getList.html" target="_blank">most interesting</a> photos
today like this:</p>
<P>
<table>
<tbody>
<tr>
<td>
<pre><span>var</span> u <span>=</span> <span>new</span> URLSearchParams<span>(</span><span>)</span><span>;</span>
<div><table><tbody><tr><td><pre><span>var</span> u <span>=</span><span>new</span> URLSearchParams<span>(</span><span>)</span><span>;</span>
u.<span>append</span><span>(</span><span>'method'</span><span>,</span><span>'flickr.interestingness.getList'</span><span>)</span><span>;</span>
u.<span>append</span><span>(</span><span>'api_key'</span><span>,</span><span>'&lt;insert api key here&gt;'</span><span>)</span><span>;</span>
u.<span>append</span><span>(</span><span>'format'</span><span>,</span><span>'json'</span><span>)</span><span>;</span>
u.<span>append</span><span>(</span><span>'nojsoncallback'</span><span>,</span> <span>'1'</span><span>)</span><span>;</span>
u.<span>append</span><span>(</span><span>'nojsoncallback'</span><span>,</span><span>'1'</span><span>)</span><span>;</span><span>var</span> apiCall <span>=</span> fetch<span>(</span><span>'https://api.flickr.com/services/rest?'</span><span>+</span> u<span>)</span><span>;</span>
 
<span>var</span> apiCall <span>=</span> fetch<span>(</span><span>'https://api.flickr.com/services/rest?'</span> <span>+</span> u<span>)</span><span>;</span>
 
apiCall.<span>then</span><span>(</span><span>function</span><span>(</span>response<span>)</span> <span>{</span>
<span>return</span> response.<span>json</span><span>(</span><span>)</span>.<span>then</span><span>(</span><span>function</span><span>(</span>json<span>)</span> <span>{</span>
<span>// photo is a list of photos.</span>
<span>return</span> json.<span>photos</span>.<span>photo</span><span>;</span>
<span>}</span><span>)</span><span>;</span>
<span>}</span><span>)</span>.<span>then</span><span>(</span><span>function</span><span>(</span>photos<span>)</span> <span>{</span>
apiCall.<span>then</span><span>(</span><span>function</span><span>(</span>response<span>)</span><span>{</span><span>return</span> response.<span>json</span><span>(</span><span>)</span>.<span>then</span><span>(</span><span>function</span><span>(</span>json<span>)</span><span>{</span><span>// photo is a list of photos.</span><span>return</span> json.<span>photos</span>.<span>photo</span><span>;</span><span>}</span><span>)</span><span>;</span><span>}</span><span>)</span>.<span>then</span><span>(</span><span>function</span><span>(</span>photos<span>)</span><span>{</span>
photos.<span>forEach</span><span>(</span><span>function</span><span>(</span>photo<span>)</span><span>{</span>
console.<span>log</span><span>(</span>photo.<span>title</span><span>)</span><span>;</span>
<span>}</span><span>)</span><span>;</span>
<span>}</span><span>)</span><span>;</span></pre>
</td>
</tr>
</tbody>
</table>
</P>
console.<span>log</span><span>(</span>photo.<span>title</span><span>)</span><span>;</span><span>}</span><span>)</span><span>;</span><span>}</span><span>)</span><span>;</span></pre></td></tr></tbody></table></div>
<p>You may not read out the “Date” header since Flickr does not allow it
via
<br>
<code>Access-Control-Expose-Headers</code>.</p>
<P>
<table>
<tbody>
<tr>
<td>
<pre>response.<span>headers</span>.<span>get</span><span>(</span><span>"Date"</span><span>)</span><span>;</span> <span>// null</span></pre>
</td>
</tr>
</tbody>
</table>
</P>
<br><code>Access-Control-Expose-Headers</code>.</p>
<div><table><tbody><tr><td><pre>response.<span>headers</span>.<span>get</span><span>(</span><span>"Date"</span><span>)</span><span>;</span><span>// null</span></pre></td></tr></tbody></table></div>
<p>The <code>credentials</code> enumeration determines if cookies for the other
domain are
<br>sent to cross-origin requests. This is similar to XHRs <code>withCredentials</code>
<br>flag, but tri-valued as <code>"omit"</code> (default), <code>"same-origin"</code> and <code>"include"</code>.</p>
<br>sent to cross-origin requests. This is similar to XHRs <code>withCredentials</code><br>flag, but tri-valued as <code>"omit"</code> (default), <code>"same-origin"</code> and <code>"include"</code>.</p>
<p>The Request object will also give the ability to offer caching hints to
the user-agent. This is currently undergoing some <a href="https://github.com/slightlyoff/ServiceWorker/issues/585" target="_blank">security review</a>.
Firefox exposes the attribute, but it has no effect.</p>
@ -349,16 +178,13 @@ apiCall.<span>then</span><span>(</span><span>function</span><span>(</span>respon
<br>intercepting them. There is the string <code>referrer</code>, which is
set by the UA to be
<br>the referrer of the Request. This may be an empty string. The other is
<br>
<code>context</code> which is a rather <a href="https://fetch.spec.whatwg.org/#requestcredentials" target="_blank">large enumeration</a> defining
<br><code>context</code> which is a rather <a href="https://fetch.spec.whatwg.org/#requestcredentials" target="_blank">large enumeration</a> defining
what sort of resource is being fetched. This could be “image” if the request
is from an
&lt;img&gt;tag in the controlled document, “worker” if it is an attempt to load a
worker script, and so on. When used with the <code>fetch()</code> function,
it is “fetch”.</p>
<h2>Response</h2>
<p><code>Response</code> instances are returned by calls to <code>fetch()</code>.
They can also be created by JS, but this is only useful in ServiceWorkers.</p>
<p>We have already seen some attributes of Response when we looked at <code>fetch()</code>.
@ -396,21 +222,9 @@ apiCall.<span>then</span><span>(</span><span>function</span><span>(</span>respon
The
<br>idiomatic way to return a Response to an intercepted request in ServiceWorkers
is:</p>
<P>
<table>
<tbody>
<tr>
<td>
<pre>addEventListener<span>(</span><span>'fetch'</span><span>,</span> <span>function</span><span>(</span>event<span>)</span> <span>{</span>
<div><table><tbody><tr><td><pre>addEventListener<span>(</span><span>'fetch'</span><span>,</span><span>function</span><span>(</span>event<span>)</span><span>{</span>
event.<span>respondWith</span><span>(</span><span>new</span> Response<span>(</span><span>"Response body"</span><span>,</span><span>{</span>
headers<span>:</span> <span>{</span> <span>"Content-Type"</span> <span>:</span> <span>"text/plain"</span> <span>}</span>
<span>}</span><span>)</span><span>;</span>
<span>}</span><span>)</span><span>;</span></pre>
</td>
</tr>
</tbody>
</table>
</P>
headers<span>:</span><span>{</span><span>"Content-Type"</span><span>:</span><span>"text/plain"</span><span>}</span><span>}</span><span>)</span><span>;</span><span>}</span><span>)</span><span>;</span></pre></td></tr></tbody></table></div>
<p>As you can see, Response has a two argument constructor, where both arguments
are optional. The first argument is a body initializer, and the second
is a dictionary to set the <code>status</code>, <code>statusText</code> and <code>headers</code>.</p>
@ -418,17 +232,13 @@ apiCall.<span>then</span><span>(</span><span>function</span><span>(</span>respon
response. Similarly, <code>Response.redirect(url, status)</code> returns
a Response resulting in
<br>a redirect to <code>url</code>.</p>
<h2>Dealing with bodies</h2>
<p>Both Requests and Responses may contain body data. Weve been glossing
over it because of the various data types body may contain, but we will
cover it in detail now.</p>
<p>A body is an instance of any of the following types.</p>
<ul>
<li>
<a href="https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/ArrayBuffer" target="_blank">ArrayBuffer</a>
</li>
<li><a href="https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/ArrayBuffer" target="_blank">ArrayBuffer</a></li>
<li>
<a href="https://developer.mozilla.org/en-US/docs/Web/API/ArrayBufferView" target="_blank">ArrayBufferView</a> (Uint8Array
and friends)</li>
@ -437,9 +247,7 @@ apiCall.<span>then</span><span>(</span><span>function</span><span>(</span>respon
<a href="https://developer.mozilla.org/en-US/docs/Web/API/File" target="_blank">File</a>
</li>
<li>string</li>
<li>
<a href="https://url.spec.whatwg.org/#interface-urlsearchparams" target="_blank">URLSearchParams</a>
</li>
<li><a href="https://url.spec.whatwg.org/#interface-urlsearchparams" target="_blank">URLSearchParams</a></li>
<li>
<a href="https://developer.mozilla.org/en-US/docs/Web/API/FormData" target="_blank">FormData</a>
currently not supported by either Gecko or Blink. Firefox expects to ship
@ -449,83 +257,38 @@ apiCall.<span>then</span><span>(</span><span>function</span><span>(</span>respon
extract their body. These all return a Promise that is eventually resolved
with the actual content.</p>
<ul>
<li>
<code>arrayBuffer()</code>
</li>
<li>
<code>blob()</code>
</li>
<li>
<code>json()</code>
</li>
<li>
<code>text()</code>
</li>
<li>
<code>formData()</code>
</li>
<li><code>arrayBuffer()</code></li>
<li><code>blob()</code></li>
<li><code>json()</code></li>
<li><code>text()</code></li>
<li><code>formData()</code></li>
</ul>
<p>This is a significant improvement over XHR in terms of ease of use of
non-text data!</p>
<p>Request bodies can be set by passing <code>body</code> parameters:</p>
<P>
<table>
<tbody>
<tr>
<td>
<pre><span>var</span> form <span>=</span> <span>new</span> FormData<span>(</span>document.<span>getElementById</span><span>(</span><span>'login-form'</span><span>)</span><span>)</span><span>;</span>
<div><table><tbody><tr><td><pre><span>var</span> form <span>=</span><span>new</span> FormData<span>(</span>document.<span>getElementById</span><span>(</span><span>'login-form'</span><span>)</span><span>)</span><span>;</span>
fetch<span>(</span><span>"/login"</span><span>,</span><span>{</span>
method<span>:</span><span>"POST"</span><span>,</span>
body<span>:</span> form
<span>}</span><span>)</span></pre>
</td>
</tr>
</tbody>
</table>
</P>
<span>}</span><span>)</span></pre></td></tr></tbody></table></div>
<p>Responses take the first argument as the body.</p>
<P>
<table>
<tbody>
<tr>
<td>
<pre><span>var</span> res <span>=</span> <span>new</span> Response<span>(</span><span>new</span> File<span>(</span><span>[</span><span>"chunk"</span><span>,</span> <span>"chunk"</span><span>]</span><span>,</span> <span>"archive.zip"</span><span>,</span>
<span>{</span> type<span>:</span> <span>"application/zip"</span> <span>}</span><span>)</span><span>)</span><span>;</span></pre>
</td>
</tr>
</tbody>
</table>
</P>
<div><table><tbody><tr><td><pre><span>var</span> res <span>=</span><span>new</span> Response<span>(</span><span>new</span> File<span>(</span><span>[</span><span>"chunk"</span><span>,</span><span>"chunk"</span><span>]</span><span>,</span><span>"archive.zip"</span><span>,</span><span>{</span> type<span>:</span><span>"application/zip"</span><span>}</span><span>)</span><span>)</span><span>;</span></pre></td></tr></tbody></table></div>
<p>Both Request and Response (and by extension the <code>fetch()</code> function),
will try to intelligently <a href="https://fetch.spec.whatwg.org/#concept-bodyinit-extract" target="_blank">determine the content type</a>.
Request will also automatically set a “Content-Type” header if none is
set in the dictionary.</p>
<h3>Streams and cloning</h3>
<p>It is important to realise that Request and Response bodies can only be
read once! Both interfaces have a boolean attribute <code>bodyUsed</code> to
determine if it is safe to read or not.</p>
<P>
<table>
<tbody>
<tr>
<td>
<pre><span>var</span> res <span>=</span> <span>new</span> Response<span>(</span><span>"one time use"</span><span>)</span><span>;</span>
<div><table><tbody><tr><td><pre><span>var</span> res <span>=</span><span>new</span> Response<span>(</span><span>"one time use"</span><span>)</span><span>;</span>
console.<span>log</span><span>(</span>res.<span>bodyUsed</span><span>)</span><span>;</span><span>// false</span>
res.<span>text</span><span>(</span><span>)</span>.<span>then</span><span>(</span><span>function</span><span>(</span>v<span>)</span><span>{</span>
console.<span>log</span><span>(</span>res.<span>bodyUsed</span><span>)</span><span>;</span> <span>// true</span>
<span>}</span><span>)</span><span>;</span>
console.<span>log</span><span>(</span>res.<span>bodyUsed</span><span>)</span><span>;</span><span>// true</span><span>}</span><span>)</span><span>;</span>
console.<span>log</span><span>(</span>res.<span>bodyUsed</span><span>)</span><span>;</span><span>// true</span>
 
res.<span>text</span><span>(</span><span>)</span>.<span>catch</span><span>(</span><span>function</span><span>(</span>e<span>)</span><span>{</span>
console.<span>log</span><span>(</span><span>"Tried to read already consumed Response"</span><span>)</span><span>;</span>
<span>}</span><span>)</span><span>;</span></pre>
</td>
</tr>
</tbody>
</table>
</P>
console.<span>log</span><span>(</span><span>"Tried to read already consumed Response"</span><span>)</span><span>;</span><span>}</span><span>)</span><span>;</span></pre></td></tr></tbody></table></div>
<p>This decision allows easing the transition to an eventual <a href="https://streams.spec.whatwg.org/" target="_blank">stream-based</a> Fetch
API. The intention is to let applications consume data as it arrives, allowing
for JavaScript to deal with larger files like videos, and perform things
@ -539,33 +302,16 @@ res.<span>text</span><span>(</span><span>)</span>.<span>catch</span><span>(</spa
will return a clone of the object, with a new body. <code>clone()</code> MUST
be called before the body of the corresponding object has been used. That
is, <code>clone()</code> first, read later.</p>
<P>
<table>
<tbody>
<tr>
<td>
<pre>addEventListener<span>(</span><span>'fetch'</span><span>,</span> <span>function</span><span>(</span>evt<span>)</span> <span>{</span>
<span>var</span> sheep <span>=</span> <span>new</span> Response<span>(</span><span>"Dolly"</span><span>)</span><span>;</span>
console.<span>log</span><span>(</span>sheep.<span>bodyUsed</span><span>)</span><span>;</span> <span>// false</span>
<span>var</span> clone <span>=</span> sheep.<span>clone</span><span>(</span><span>)</span><span>;</span>
<div><table><tbody><tr><td><pre>addEventListener<span>(</span><span>'fetch'</span><span>,</span><span>function</span><span>(</span>evt<span>)</span><span>{</span><span>var</span> sheep <span>=</span><span>new</span> Response<span>(</span><span>"Dolly"</span><span>)</span><span>;</span>
console.<span>log</span><span>(</span>sheep.<span>bodyUsed</span><span>)</span><span>;</span><span>// false</span><span>var</span> clone <span>=</span> sheep.<span>clone</span><span>(</span><span>)</span><span>;</span>
console.<span>log</span><span>(</span>clone.<span>bodyUsed</span><span>)</span><span>;</span><span>// false</span>
 
clone.<span>text</span><span>(</span><span>)</span><span>;</span>
console.<span>log</span><span>(</span>sheep.<span>bodyUsed</span><span>)</span><span>;</span><span>// false</span>
console.<span>log</span><span>(</span>clone.<span>bodyUsed</span><span>)</span><span>;</span><span>// true</span>
 
evt.<span>respondWith</span><span>(</span>cache.<span>add</span><span>(</span>sheep.<span>clone</span><span>(</span><span>)</span><span>)</span>.<span>then</span><span>(</span><span>function</span><span>(</span>e<span>)</span> <span>{</span>
<span>return</span> sheep<span>;</span>
<span>}</span><span>)</span><span>;</span>
<span>}</span><span>)</span><span>;</span></pre>
</td>
</tr>
</tbody>
</table>
</P>
evt.<span>respondWith</span><span>(</span>cache.<span>add</span><span>(</span>sheep.<span>clone</span><span>(</span><span>)</span><span>)</span>.<span>then</span><span>(</span><span>function</span><span>(</span>e<span>)</span><span>{</span><span>return</span> sheep<span>;</span><span>}</span><span>)</span><span>;</span><span>}</span><span>)</span><span>;</span></pre></td></tr></tbody></table></div>
<h2>Future improvements</h2>
<p>Along with the transition to streams, Fetch will eventually have the ability
to abort running <code>fetch()</code>es and some way to report the progress
of a fetch. These are provided by XHR, but are a little tricky to fit in
@ -576,19 +322,8 @@ res.<span>text</span><span>(</span><span>)</span>.<span>catch</span><span>(</spa
<a href="https://github.com/slightlyoff/ServiceWorker/issues" target="_blank">ServiceWorker</a>specifications.</p>
<p>For a better web!</p>
<p><em>The author would like to thank Andrea Marchesini, Anne van Kesteren and Ben<br>
Kelly for helping with the specification and implementation.</em>
</p>
<footer>
<p>Posted by <a href="https://hacks.mozilla.org/author/nmarathemozilla-com/" title="Posts by Nikhil Marathe" rel="author" target="_blank">Nikhil Marathe</a>
Kelly for helping with the specification and implementation.</em></p>
<footer><p>Posted by <a href="https://hacks.mozilla.org/author/nmarathemozilla-com/" title="Posts by Nikhil Marathe" rel="author" target="_blank">Nikhil Marathe</a>
on
<time datetime="2015-03-10T08:05:41-07:00">March 10, 2015</time>at
<time datetime="PDT08:05:41-07:00">08:05</time>
</p>
<P>
</P>
</footer>
</article>
</DIV></article>
<time datetime="PDT08:05:41-07:00">08:05</time></p></footer></article></DIV></article>

View file

@ -0,0 +1,68 @@
<article><DIV id="readability-page-1" class="page">
<div id="readspeaker_area"><P id="readspeaker_button1"></P></div>
<div id="textArea">
<P></P>
<p></p>
<p>Feb. 23, 2015 -- Life-threatening peanut allergies have mysteriously
been
on the rise in the past decade, with little hope for a cure.</p>
<p xmlns:xalan="http://xml.apache.org/xalan">But a groundbreaking new
study may offer a way to stem that rise, while
another may offer some hope for those who are already allergic.</p>
<p>Parents have been told for years to avoid giving foods containing
peanuts
to babies for fear of triggering an allergy. Now research shows the
opposite
is true: Feeding babies snacks made with peanuts before their first
birthday
appears to prevent that from happening.</p>
<p>The study is published in the <i>New England Journal of Medicine,</i>
and
it was presented at the annual meeting of the American Academy of
Allergy,
Asthma and Immunology in Houston. It found that among children at
high
risk for getting peanut allergies, eating peanut snacks by 11 months
of
age and continuing to eat them at least three times a week until age
5
cut their chances of becoming allergic by more than 80% compared to
kids
who avoided peanuts. Those at high risk were already allergic to
egg, they
had the skin condition <a href="http://www.webmd.com/skin-problems-and-treatments/eczema/default.htm" target="_blank">eczema</a>, or
both.</p>
<p>Overall, about 3% of kids who ate peanut butter or peanut snacks
before
their first birthday got an allergy, compared to about 17% of kids
who
didnt eat them.</p>
<p>“I think this study is an astounding and groundbreaking study,
really,”
says Katie Allen, MD, PhD. She's the director of the Center for Food
and
Allergy Research at the Murdoch Childrens Research Institute in
Melbourne,
Australia. Allen was not involved in the research.</p>
<p>Experts say the research should shift thinking about how kids develop
<a href="http://www.webmd.com/allergies/guide/food-allergy-intolerances" target="_blank">food
allergies</a>, and it should change the guidance doctors give to
parents.
</p>
<p>Meanwhile, for children and adults who are already <a href="http://www.webmd.com/allergies/guide/nut-allergy" target="_blank">allergic to peanuts</a>,
another study presented at the same meeting held out hope of a
treatment.</p>
<p>A new skin patch called Viaskin allowed people with peanut allergies
to
eat tiny amounts of peanuts after they wore it for a year.</p>
<p></p>
<h3>A Change in Guidelines?</h3>
<p>Allergies to peanuts and other foods are on the rise. In the U.S.,
more
than 2% of people react to peanuts, a 400% increase since 1997. And
reactions
to peanuts and other tree nuts can be especially severe. Nuts are
the main
reason people get a life-threatening problem called <a href="http://www.webmd.com/allergies/guide/anaphylaxis" target="_blank">anaphylaxis</a>.</p>
</div>
</DIV></article>

File diff suppressed because it is too large Load diff

View file

@ -40,7 +40,7 @@ impl FullTextParser {
url: &url::Url,
client: &Client,
) -> Result<Article, FullTextParserError> {
libxml::tree::node::set_node_rc_guard(3);
libxml::tree::node::set_node_rc_guard(4);
info!("Scraping article: '{}'", url.as_str());

View file

@ -32,6 +32,12 @@ impl Readability {
while let Some(node_ref) = node.as_mut() {
let tag_name = node_ref.get_name().to_uppercase();
if tag_name == "TEXT" && node_ref.get_content().trim().is_empty() {
node = Util::remove_and_next(node_ref);
continue;
}
let match_string = node_ref
.get_class_names()
.iter()
@ -107,16 +113,12 @@ impl Readability {
for mut child_node in node_ref.get_child_nodes().into_iter() {
if Self::is_phrasing_content(&child_node) {
if let Some(p) = p.as_mut() {
child_node.unlink();
let _ = p.add_child(&mut child_node);
} else if !Util::is_whitespace(&child_node) {
child_node.unlink();
let mut new_node = Node::new("p", None, &document)
.map_err(|()| FullTextParserError::Readability)?;
node_ref
.replace_child_node(new_node.clone(), child_node.clone())
.map_err(|error| {
log::error!("{error}");
FullTextParserError::Readability
})?;
new_node.add_child(&mut child_node).map_err(|error| {
log::error!("{error}");
FullTextParserError::Readability
@ -247,6 +249,9 @@ impl Readability {
});
let top_candidates = candidates.into_iter().take(5).collect::<Vec<_>>();
// for candidate in top_candidates.iter() {
// println!("candidate: {} {:?}", candidate.get_name(), candidate.get_attributes());
// }
let mut needed_to_create_top_candidate = false;
let mut top_candidate = top_candidates.first().cloned().unwrap_or_else(|| {
// If we still have no top candidate, just use the body as a last resort.
@ -619,12 +624,8 @@ impl Readability {
is_text_node
|| constants::PHRASING_ELEMS.contains(&tag_name.as_str())
|| (tag_name == "A" || tag_name == "DEL" || tag_name == "INS")
&& node
.get_child_nodes()
.iter()
.map(Self::is_phrasing_content)
.all(|val| val)
|| ((tag_name == "A" || tag_name == "DEL" || tag_name == "INS")
&& node.get_child_nodes().iter().all(Self::is_phrasing_content))
}
// Initialize a node with the readability object. Also checks the

View file

@ -7,7 +7,7 @@ use crate::{
};
async fn run_test(name: &str) {
libxml::tree::node::set_node_rc_guard(3);
libxml::tree::node::set_node_rc_guard(4);
let _ = env_logger::builder().is_test(true).try_init();
let empty_config = ConfigEntry::default();
@ -43,22 +43,27 @@ async fn run_test(name: &str) {
article.document = Some(article_document);
let html = article.get_content().unwrap();
//std::fs::write("expected.html", &html).unwrap();
let expected = std::fs::read_to_string(format!(
"./resources/tests/readability/{name}/expected.html"
))
.expect("Failed to read expected HTML");
//std::fs::write("expected.html", &html).unwrap();
assert_eq!(expected, html);
}
#[tokio::test(flavor = "current_thread")]
#[tokio::test]
async fn test_001() {
run_test("001").await
}
#[tokio::test(flavor = "current_thread")]
#[tokio::test]
async fn test_002() {
run_test("002").await
}
#[tokio::test]
async fn webmd_1() {
run_test("webmd-1").await
}

View file

@ -360,11 +360,11 @@ impl Util {
pub fn has_single_tag_inside_element(node: &Node, tag: &str) -> bool {
// There should be exactly 1 element child with given tag
if node.get_child_nodes().len() == 1
if node.get_child_nodes().len() != 1
|| node
.get_child_nodes()
.first()
.map(|n| n.get_name().to_uppercase() == tag)
.map(|n| n.get_name().to_uppercase() != tag)
.unwrap_or(false)
{
return false;
@ -438,8 +438,8 @@ impl Util {
// Determine whether element has any children block level elements.
pub fn has_child_block_element(node: &Node) -> bool {
node.get_child_elements().iter().any(|node| {
constants::DIV_TO_P_ELEMS.contains(node.get_name().as_str())
node.get_child_nodes().iter().any(|node| {
constants::DIV_TO_P_ELEMS.contains(node.get_name().to_uppercase().as_str())
|| Self::has_child_block_element(node)
})
}