What is this?
Web update checker.
- A simple CLI application that generates an HTML file listing update information.
- Suppresses false detections caused by dynamic page content (e.g. ads) without relying on a word blacklist.
- Extracts the updated parts of the content.
- Reduces network traffic by sending requests with an Accept-Encoding: gzip header and by checking the Content-Length header before downloading content.
- Retrieves pages over N parallel connections.
- Depends only on Python. Cross-platform.
- Flatten the HTML DOM tree into a sequence of paragraphs.
- Apply a diff algorithm to detect inserted and deleted paragraphs.
- Filter out irrelevant changes using a linear combination of the standard scores of (#[anchored text] / #[whole text]) and (log #[whole text]), computed per page ("#[X]" means "the length of X").
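
The traffic-saving and parallel-retrieval features listed above could look roughly like the following. This is a minimal sketch using only the Python standard library, not the tool's actual code. The function names (build_request, is_unchanged, decode_body, fetch, fetch_all) are hypothetical, as is the policy of skipping a page whose Content-Length equals the value recorded on the previous run.

```python
import gzip
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def build_request(url):
    """Build a request that asks the server for a gzip-compressed body."""
    return urllib.request.Request(url, headers={"Accept-Encoding": "gzip"})


def is_unchanged(headers, known_length):
    """Assumed policy: an identical Content-Length means 'skip the download'."""
    length = headers.get("Content-Length")
    return (
        length is not None
        and known_length is not None
        and int(length) == known_length
    )


def decode_body(raw, headers):
    """Decompress the body if the server actually answered with gzip."""
    if (headers.get("Content-Encoding") or "").lower() == "gzip":
        return gzip.decompress(raw)
    return raw


def fetch(url, known_length=None):
    """Return the page body as bytes, or None if the download was skipped."""
    with urllib.request.urlopen(build_request(url), timeout=30) as resp:
        # Headers are available before the body has been read.
        if is_unchanged(resp.headers, known_length):
            return None
        return decode_body(resp.read(), resp.headers)


def fetch_all(urls, n_parallel=4, fetch_one=fetch):
    """Fetch every URL with at most n_parallel worker threads, keeping order."""
    with ThreadPoolExecutor(max_workers=n_parallel) as pool:
        return list(pool.map(fetch_one, urls))
```

Note that when the server compresses the response, Content-Length is the compressed size, so it only serves as a cheap hint that a page may be unchanged.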
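
The three steps above (flatten, diff, filter) can be sketched as follows. This is an illustration under stated assumptions, not the author's implementation: the set of block-level tags, the weights w_anchor and w_length, the threshold, and the choice to standardize over all paragraphs of the new page are guesses made for the example. difflib stands in for whatever diff algorithm the tool really uses.

```python
import difflib
import math
from html.parser import HTMLParser
from statistics import mean, pstdev

# Assumed set of tags that end a paragraph.
BLOCK_TAGS = {"body", "p", "div", "li", "td", "th", "tr", "br",
              "h1", "h2", "h3", "h4", "h5", "h6"}


class Flattener(HTMLParser):
    """Flatten a page into (text, anchored_length) paragraphs."""

    def __init__(self):
        super().__init__()
        self.paragraphs = []
        self._chunks, self._anchored, self._in_anchor = [], 0, 0

    def flush(self):
        text = " ".join("".join(self._chunks).split())
        if text:
            self.paragraphs.append((text, self._anchored))
        self._chunks, self._anchored = [], 0

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._in_anchor += 1
        elif tag in BLOCK_TAGS:
            self.flush()

    def handle_endtag(self, tag):
        if tag == "a":
            self._in_anchor = max(0, self._in_anchor - 1)
        elif tag in BLOCK_TAGS:
            self.flush()

    def handle_data(self, data):
        self._chunks.append(data)
        if self._in_anchor:  # text inside <a>...</a> counts as anchored
            self._anchored += len(" ".join(data.split()))


def flatten(html):
    parser = Flattener()
    parser.feed(html)
    parser.close()
    parser.flush()
    return parser.paragraphs


def inserted_indices(old, new):
    """Indices of paragraphs in `new` that the diff marks as inserted."""
    matcher = difflib.SequenceMatcher(
        None, [t for t, _ in old], [t for t, _ in new], autojunk=False)
    return [j for op, _, _, j1, j2 in matcher.get_opcodes()
            if op in ("insert", "replace") for j in range(j1, j2)]


def zscores(values):
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma if sigma > 0 else 0.0 for v in values]


def score_paragraphs(paragraphs, w_anchor=-1.0, w_length=1.0):
    """Linear combination of the two standard scores; weights are assumed."""
    if not paragraphs:
        return []
    ratio = zscores([a / len(t) for t, a in paragraphs])
    length = zscores([math.log(len(t)) for t, _ in paragraphs])
    return [w_anchor * r + w_length * l for r, l in zip(ratio, length)]


def relevant_changes(old_html, new_html, threshold=0.0):
    """Inserted paragraphs whose score exceeds the (assumed) threshold."""
    old, new = flatten(old_html), flatten(new_html)
    scores = score_paragraphs(new)
    return [new[j][0] for j in inserted_indices(old, new)
            if scores[j] > threshold]
```

With these weights, a short paragraph made up mostly of link text (typical of ads and navigation) scores low and is dropped, while a long paragraph of plain text scores high and is reported.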
- Please write URIs one per line in
- A Web browser will start automatically when the run finishes. If it does not, please open a
Yasuhiro Fujii <y-fujii at mimosa-pudica.net>