mirror of https://gitlab.com/news-flash/article_scraper.git synced 2025-07-07 16:15:32 +02:00
Commit graph

28 commits

Author SHA1 Message Date
Jan Lukas Gernert 90d45ea6b3 tmp: dont strip scripts 2020-01-27 16:36:32 +01:00
Jan Lukas Gernert 8247defe54 fix typo 2020-01-27 14:59:56 +01:00
Jan Lukas Gernert 9e995122c4 only strip topmost nodes in tree branches 2019-12-19 17:36:48 +01:00
Jan Lukas Gernert 26346839f2 remove prints 2019-11-19 19:28:49 +01:00
Jan Lukas Gernert edfbca3cf3 fix document going out of scope 2019-11-19 14:41:08 +01:00
Jan Lukas Gernert 2c6bfed550 frickel 2019-11-18 05:53:34 +01:00
Jan Lukas Gernert 4b8af0d709 wip: async 2019-11-10 14:43:59 +01:00
Jan Lukas Gernert 5f82872d1f don't attempt to redownload embeded images 2019-09-26 21:48:24 +02:00
Jan Lukas Gernert 4f5aef8e17 merge 2019-09-26 21:29:11 +02:00
Jan Lukas Gernert 2137e84743 update to new serialization api of libxml 2019-09-26 21:28:05 +02:00
Jan Lukas Gernert a99b8dec47 wip: test libxml XML_SAVE_NO_EMPTY option 2019-09-24 18:45:06 +02:00
Jan Lukas Gernert a44ac3663c don't resize animated images 2019-09-24 14:25:57 +02:00
Jan Lukas Gernert b489af74bd create data dir if it doesn't exist 2019-09-24 03:16:37 +02:00
Jan Lukas Gernert 481a2f41ac don't abort image download on failed image 2019-09-24 02:56:45 +02:00
Jan Lukas Gernert f9905c8a9d download images parameter to parse method 2019-09-24 02:43:36 +02:00
Jan Lukas Gernert f1be8a2608 make image downloader public 2019-09-24 02:40:43 +02:00
Jan Lukas Gernert 3ca59d7f02 embed images as base64 inside article html 2019-03-06 18:37:24 +01:00
Jan Lukas Gernert e1905d3c2c remove life time annotations added by rust 2018 2018-12-08 23:25:07 +01:00
Jan Lukas Gernert 02356a51aa fix save_html returning error even it succeeded 2018-12-07 15:31:09 +01:00
Jan Lukas Gernert aa26e099df get rid of 'extern crate' 2018-12-07 02:26:46 +01:00
Jan Lukas Gernert b679f2e1fa update to rust 2018 2018-12-07 02:19:40 +01:00
Jan Lukas Gernert 6f38c2bc4c merge 2018-12-07 02:17:15 +01:00
Jan Lukas Gernert 5555118914 update to reqwest 0.9 2018-12-07 02:15:06 +01:00
Jan Lukas Gernert fcea6cf5d1 update to reqwest 0.9 2018-12-07 02:14:50 +01:00
Jan Lukas Gernert fab4306ed9 TIL: map_err 2018-08-31 16:49:58 +02:00
Jan Lukas Gernert 5beb25a575 remove old dependency 2018-08-29 18:25:30 +02:00
Jan Lukas Gernert ed66a705c1 update deps and remove html2text 2018-08-02 17:05:26 +02:00
Jan Lukas Gernert 4b2e6a24eb initial commit 2018-07-31 16:10:09 +02:00