Jan Lukas Gernert
|
65d9296786
|
(cargo-release) version 1.1.4
|
2020-06-07 13:22:00 +02:00 |
|
Jan Lukas Gernert
|
6b6c52f315
|
only use builtin youtube parsing if no config is provided
|
2020-06-07 13:21:53 +02:00 |
|
Jan Lukas Gernert
|
34eaf1eeb1
|
fmt
|
2020-06-07 12:53:33 +02:00 |
|
Jan Lukas Gernert
|
66de57731d
|
(cargo-release) start next development iteration 1.1.4-alpha.0
|
2020-06-07 12:40:43 +02:00 |
|
Jan Lukas Gernert
|
de26d62e5f
|
(cargo-release) version 1.1.3
|
2020-06-07 12:39:57 +02:00 |
|
Jan Lukas Gernert
|
82a0a46323
|
special handling for youtube videos
|
2020-06-07 12:39:44 +02:00 |
|
Jan Lukas Gernert
|
a871d5b82e
|
(cargo-release) start next development iteration 1.1.3-alpha.0
|
2020-06-06 05:19:26 +02:00 |
|
Jan Lukas Gernert
|
210601eaff
|
(cargo-release) version 1.1.2
|
2020-06-06 05:18:38 +02:00 |
|
Jan Lukas Gernert
|
a42ececb2a
|
check if final url differs from original even without redirect status
|
2020-06-06 05:18:25 +02:00 |
|
Jan Lukas Gernert
|
3bb8485f40
|
Merge branch 'fmt+lint' into 'master'
fix fmt+lint
See merge request news-flash/article_scraper!5
|
2020-05-31 03:04:48 +00:00 |
|
Felix Buehler
|
fa54b82e52
|
[ci] add fmt + lint checking
|
2020-05-30 13:07:10 +02:00 |
|
Felix Buehler
|
0c3946dd5b
|
fix fmt+lint
|
2020-05-29 18:55:00 +02:00 |
|
Jan Lukas Gernert
|
7c9a512a34
|
(cargo-release) start next development iteration 1.1.2-alpha.0
|
2020-05-23 14:15:43 +02:00 |
|
Jan Lukas Gernert
|
1552967462
|
(cargo-release) version 1.1.1
|
2020-05-23 14:14:50 +02:00 |
|
Jan Lukas Gernert
|
f78cccf2a2
|
remove unused htmlescaper
|
2020-05-23 14:14:36 +02:00 |
|
Jan Lukas Gernert
|
9976eb9123
|
(cargo-release) start next development iteration 1.1.1-alpha.0
|
2020-05-20 16:34:38 +02:00 |
|
Jan Lukas Gernert
|
f51605a92c
|
naivedatetime -> datetime utc
|
2020-05-20 16:33:40 +02:00 |
|
Jan Lukas Gernert
|
8f48b69161
|
remove unneeded files
|
2020-04-28 03:07:21 +02:00 |
|
Jan Lukas Gernert
|
1fd7173eac
|
update for newer deps
|
2020-04-28 02:51:30 +02:00 |
|
Jan Lukas Gernert
|
1fbce6413d
|
Merge branch 'master' of gitlab.com:news-flash/article_scraper
|
2020-04-28 02:34:24 +02:00 |
|
Jan Lukas Gernert
|
f6d021b67b
|
first release
|
2020-04-28 02:33:25 +02:00 |
|
Jan Lukas Gernert
|
d2960d8539
|
require client for parsing
|
2020-02-10 18:01:35 +01:00 |
|
Jan Lukas Gernert
|
a7c247549a
|
remove unused crate
|
2020-02-06 21:08:58 +01:00 |
|
Jan Lukas Gernert
|
1ecc0fc4b4
|
option to set custom reqwest client
|
2020-02-03 17:46:54 +01:00 |
|
Jan Lukas Gernert
|
71055eed1c
|
fix corrupt filename
|
2020-01-27 17:32:17 +01:00 |
|
Jan Lukas Gernert
|
98348b7e59
|
tmp: don't strip scripts
|
2020-01-27 16:47:13 +01:00 |
|
Jan Lukas Gernert
|
23514aff9e
|
less dramatic logging
|
2020-01-27 02:03:06 +01:00 |
|
Jan Lukas Gernert
|
afe661fe6c
|
only go for single page link if xpath res isn't empty
|
2020-01-27 01:54:37 +01:00 |
|
Jan Lukas Gernert
|
e58acf828c
|
improve logging clarity
|
2020-01-27 01:48:54 +01:00 |
|
Jan Lukas Gernert
|
c720dbc299
|
fixup
|
2020-01-27 01:35:15 +01:00 |
|
Jan Lukas Gernert
|
b272c99911
|
fix missing '/' in url completion
|
2020-01-27 01:21:21 +01:00 |
|
Jan Lukas Gernert
|
f570873aba
|
load config files in background thread
|
2020-01-26 21:44:26 +01:00 |
|
Jan Lukas Gernert
|
2cac8a2678
|
go back to stable libxml
|
2020-01-26 17:34:47 +01:00 |
|
Jan Lukas Gernert
|
8025e8f004
|
Merge branch 'async' into 'master'
Async
See merge request news-flash/article_scraper!3
|
2020-01-19 21:15:13 +00:00 |
|
Jan Lukas Gernert
|
d9c7ef1471
|
Merge branch 'master' into 'async'
# Conflicts:
# Cargo.toml
# src/images/mod.rs
# src/lib.rs
|
2020-01-19 21:15:08 +00:00 |
|
Jan Lukas Gernert
|
d843809437
|
update reqwest to stable
|
2020-01-18 19:06:53 +01:00 |
|
Jan Lukas Gernert
|
9e995122c4
|
only strip topmost nodes in tree branches
|
2019-12-19 17:36:48 +01:00 |
|
Jan Lukas Gernert
|
b032ec99bc
|
Merge branch 'async' into 'master'
Async
See merge request news-flash/article_scraper!2
|
2019-12-16 11:37:00 +00:00 |
|
Jan Lukas Gernert
|
9c35fb9fa8
|
Async
|
2019-12-16 11:36:59 +00:00 |
|
Jan Lukas Gernert
|
26346839f2
|
remove prints
|
2019-11-19 19:28:49 +01:00 |
|
Jan Lukas Gernert
|
edfbca3cf3
|
fix document going out of scope
|
2019-11-19 14:41:08 +01:00 |
|
Jan Lukas Gernert
|
2c6bfed550
|
tinkering
|
2019-11-18 05:53:34 +01:00 |
|
Jan Lukas Gernert
|
4b8af0d709
|
wip: async
|
2019-11-10 14:43:59 +01:00 |
|
Jan Lukas Gernert
|
5f82872d1f
|
don't attempt to redownload embedded images
|
2019-09-26 21:48:24 +02:00 |
|
Jan Lukas Gernert
|
4f5aef8e17
|
merge
|
2019-09-26 21:29:11 +02:00 |
|
Jan Lukas Gernert
|
2137e84743
|
update to new serialization api of libxml
|
2019-09-26 21:28:05 +02:00 |
|
Jan Lukas Gernert
|
a99b8dec47
|
wip: test libxml XML_SAVE_NO_EMPTY option
|
2019-09-24 18:45:06 +02:00 |
|
Jan Lukas Gernert
|
a44ac3663c
|
don't resize animated images
|
2019-09-24 14:25:57 +02:00 |
|
Jan Lukas Gernert
|
b489af74bd
|
create data dir if it doesn't exist
|
2019-09-24 03:16:37 +02:00 |
|
Jan Lukas Gernert
|
481a2f41ac
|
don't abort image download on failed image
|
2019-09-24 02:56:45 +02:00 |
|