Commit Graph

229 Commits

Author SHA1 Message Date
io
04178b37d7 fix scraping posts
saves the cursors provided in the first page to the db so that we can reuse it next time we fetch,
instead of assuming the format of the cursor URL manually using min_id
2021-06-14 21:39:17 +00:00
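
The idea in 04178b37d7 (store the cursor the server hands back rather than building a min_id URL by hand) could look roughly like the sketch below. It is an illustration only, not the repository's code: the `cursors` table, its columns, the helper names, and the assumption that the first page exposes its cursor under a `prev` key are all hypothetical.

```python
import sqlite3


def init_db(conn):
    # Hypothetical table: one saved cursor URL per account.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS cursors ("
        " account TEXT PRIMARY KEY,"
        " cursor_url TEXT NOT NULL)"
    )


def save_cursor(conn, account, first_page, key="prev"):
    """Persist the cursor URL found in the first page, if the page has one."""
    url = first_page.get(key)
    if url is None:
        return  # nothing to save; keep whatever was stored before
    conn.execute(
        "INSERT OR REPLACE INTO cursors (account, cursor_url) VALUES (?, ?)",
        (account, url),
    )
    conn.commit()


def load_cursor(conn, account, default_url):
    """Return the saved cursor URL, or default_url on the first fetch."""
    row = conn.execute(
        "SELECT cursor_url FROM cursors WHERE account = ?", (account,)
    ).fetchone()
    return row[0] if row else default_url
```

On the next run, a fetcher would start from `load_cursor(...)` and treat the URL as opaque, so a change in the server's cursor format does not break it.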
io
fe1474ffd0 call raise_for_status() on all GET requests 2021-06-14 20:34:33 +00:00
io
a46d7fe95c SQL NULL a fuck 2021-06-11 21:37:09 +00:00
io
71dbf59796 add ability to ignore CWs 2021-06-11 21:29:51 +00:00
Agatha Rose
a904587b32
Clean up formatting and help linter calm down 2021-06-05 00:38:36 +03:00
Agatha Rose
dd78364f2d
Expose overlap ratio and length limit to config 2021-06-05 00:14:56 +03:00
Agatha Rose
54563726b2
Add testing virtual env to .gitignore 2021-06-04 23:57:40 +03:00
Agatha Rose
63161444a9
Merge pull request #1 from otrapersona/dedup_trigger
Add trigger to remove duplicate posts on db
2021-06-04 22:58:42 +03:00
otrapersona
be8227c70a Changed group of trigger
I think there's a tiny chance that two posts on diff instances have the same id, problem solved by using the uri.
2021-03-13 13:54:32 -06:00
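
A deduplicating trigger grouped by uri, as be8227c70a describes, might be sketched as follows. This assumes an SQLite database; the `posts` table layout and the trigger name are made up for the example and are not taken from the repository.

```python
import sqlite3


def make_db():
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        -- Hypothetical schema: id is only unique per instance, uri is global.
        CREATE TABLE posts (id TEXT, uri TEXT, content TEXT);

        -- After every insert, keep only the oldest row for each uri.
        CREATE TRIGGER dedup_posts AFTER INSERT ON posts
        BEGIN
            DELETE FROM posts
            WHERE rowid NOT IN (SELECT MIN(rowid) FROM posts GROUP BY uri);
        END;
        """
    )
    return conn


def count_posts(conn):
    return conn.execute("SELECT COUNT(*) FROM posts").fetchone()[0]
```

Grouping by uri instead of id means two posts from different instances that happen to share an id are both kept, while a true repeat of the same uri is dropped.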
otrapersona
9f80c2746f Add trigger
Fixes symptom but not cause 🤷‍♀️
2021-03-13 13:46:18 -06:00
Agatha Rose
27f61c4374
Make bs4 only replace the tag name instead of name and contents 2021-02-18 18:01:43 +02:00
dependabot-preview[bot]
d07d49d42e
Merge pull request #43 from Lynnesbian/dependabot/pip/markovify-0.8.2 2020-08-02 05:04:11 +00:00
dependabot-preview[bot]
82943a1303
Bump markovify from 0.8.0 to 0.8.2
Bumps [markovify](https://github.com/jsvine/markovify) from 0.8.0 to 0.8.2.
- [Release notes](https://github.com/jsvine/markovify/releases)
- [Commits](https://github.com/jsvine/markovify/commits)

Signed-off-by: dependabot-preview[bot] <support@dependabot.com>
2020-08-02 05:03:05 +00:00
Lynne
09a1efc30a
Merge pull request #42 from Lynnesbian/dependabot/pip/requests-2.24.0
Bump requests from 2.23.0 to 2.24.0
2020-08-02 15:02:01 +10:00
Lynne
2baf060a08
Merge branch 'master' into dependabot/pip/requests-2.24.0 2020-08-02 15:01:49 +10:00
Lynne
64079a96cb
removed patreon 2020-08-02 14:54:22 +10:00
dependabot-preview[bot]
74046032aa
Bump requests from 2.23.0 to 2.24.0
Bumps [requests](https://github.com/psf/requests) from 2.23.0 to 2.24.0.
- [Release notes](https://github.com/psf/requests/releases)
- [Changelog](https://github.com/psf/requests/blob/master/HISTORY.md)
- [Commits](https://github.com/psf/requests/compare/v2.23.0...v2.24.0)

Signed-off-by: dependabot-preview[bot] <support@dependabot.com>
2020-08-02 04:25:23 +00:00
Lynne
bb39af52a9
Merge pull request #41 from Lynnesbian/dependabot/pip/beautifulsoup4-4.9.1
Bump beautifulsoup4 from 4.8.2 to 4.9.1
2020-08-02 14:24:33 +10:00
Lynne
96c047a40b
Merge pull request #39 from Lynnesbian/dependabot/pip/mastodon-py-1.5.1
Bump mastodon-py from 1.5.0 to 1.5.1
2020-08-02 14:24:14 +10:00
Lynne
8274409bf4
update extract code to match fedibooks 2020-05-27 22:31:16 +10:00
dependabot-preview[bot]
7b2fe14ba5
Bump beautifulsoup4 from 4.8.2 to 4.9.1
Bumps [beautifulsoup4](http://www.crummy.com/software/BeautifulSoup/bs4/) from 4.8.2 to 4.9.1.

Signed-off-by: dependabot-preview[bot] <support@dependabot.com>
2020-05-18 19:13:06 +00:00
dependabot-preview[bot]
a5fd049309
Bump mastodon-py from 1.5.0 to 1.5.1
Bumps [mastodon-py](https://github.com/halcy/Mastodon.py) from 1.5.0 to 1.5.1.
- [Release notes](https://github.com/halcy/Mastodon.py/releases)
- [Changelog](https://github.com/halcy/Mastodon.py/blob/master/CHANGELOG.rst)
- [Commits](https://github.com/halcy/Mastodon.py/compare/1.5.0...1.5.1)

Signed-off-by: dependabot-preview[bot] <support@dependabot.com>
2020-03-16 19:16:47 +00:00
Lynnesbian
2321f75e11
added note about contacting me to help with the docker stuff 2020-03-11 18:14:59 +10:00
Lynnesbian
523657f8ba
move secure fetch stuff to a wiki page, refine some info 2020-03-11 18:12:34 +10:00
Lynnesbian
0845ca17d5
added a bunch of info about secure fetch, donations, and compatibility 2020-03-11 18:04:51 +10:00
Lynnesbian
04520a57ef
added a note about the docker version 2020-03-11 17:37:18 +10:00
Lynnesbian
44037919b8
added a note about fedibooks 2020-03-10 17:24:07 +10:00
Lynnesbian
19c7795772
added a note on why diaspora* won't work 2020-03-10 17:17:35 +10:00
Lynnesbian
68b7ac7c3b
improved readme with more info about some common issues 2020-03-10 17:15:23 +10:00
Lynnesbian
5534f99fc7
fixed a pleroma bug that's been around for ages but apparently nobody noticed whoops 2020-03-10 17:12:23 +10:00
Lynnesbian
c3b9d91ce7
fix another dumb mistake that broke pleroma 2020-03-10 16:59:29 +10:00
Lynnesbian
ac411e15a9
print warning after printing cfg file location 2020-03-10 16:54:00 +10:00
Lynnesbian
8db84a5656
check to see whether site starts with http(s):// 2020-03-10 16:53:30 +10:00
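
A check like the one 8db84a5656 mentions could be as small as the sketch below. The function name is hypothetical, and the real code may handle a failed check differently (for example by warning or exiting).

```python
import re


def has_http_scheme(site):
    """Return True if site starts with http:// or https://."""
    return re.match(r"https?://", site) is not None
```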
Lynne
dbd74ed6fe
handle case with a single json page better 2020-03-10 14:35:12 +10:00
Lynne
eea48dda1c
fixed an error causing pleroma to always fail 2020-03-10 14:33:26 +10:00
Lynnesbian
1c8b86543b
Merge branch 'master' of https://github.com/Lynnesbian/mstdn-ebooks 2020-03-08 19:57:42 +10:00
Lynnesbian
fbde3cb911
only use pleroma mode if 'prev' key exists, handle final page better 2020-03-08 19:57:06 +10:00
Lynne
3d869a38c1
Merge pull request #37 from Lynnesbian/dependabot/pip/beautifulsoup4-4.8.2
Bump beautifulsoup4 from 4.8.1 to 4.8.2
2020-03-08 19:54:18 +10:00
dependabot-preview[bot]
0d74362d4d
Bump beautifulsoup4 from 4.8.1 to 4.8.2
Bumps [beautifulsoup4](http://www.crummy.com/software/BeautifulSoup/bs4/) from 4.8.1 to 4.8.2.

Signed-off-by: dependabot-preview[bot] <support@dependabot.com>
2020-03-08 09:54:12 +00:00
Lynne
1512f8b8c2
Merge pull request #38 from Lynnesbian/dependabot/pip/requests-2.23.0
Bump requests from 2.22.0 to 2.23.0
2020-03-08 19:53:17 +10:00
Lynne
2ee9e18491
Merge pull request #36 from Lynnesbian/dependabot/pip/markovify-0.8.0
Bump markovify from 0.7.2 to 0.8.0
2020-03-08 19:53:04 +10:00
Lynnesbian
7d718bbe3a
minor code cleanup 2020-03-08 19:46:07 +10:00
dependabot-preview[bot]
4d4ea4228b
Bump requests from 2.22.0 to 2.23.0
Bumps [requests](https://github.com/psf/requests) from 2.22.0 to 2.23.0.
- [Release notes](https://github.com/psf/requests/releases)
- [Changelog](https://github.com/psf/requests/blob/master/HISTORY.md)
- [Commits](https://github.com/psf/requests/compare/v2.22.0...v2.23.0)

Signed-off-by: dependabot-preview[bot] <support@dependabot.com>
2020-02-19 19:34:12 +00:00
dependabot-preview[bot]
e685caf48d
Bump markovify from 0.7.2 to 0.8.0
Bumps [markovify](https://github.com/jsvine/markovify) from 0.7.2 to 0.8.0.
- [Release notes](https://github.com/jsvine/markovify/releases)
- [Commits](https://github.com/jsvine/markovify/compare/v0.7.2...v0.8.0)

Signed-off-by: dependabot-preview[bot] <support@dependabot.com>
2019-12-18 19:59:37 +00:00
Lynne
d698cd445a
Merge pull request #34 from Lynnesbian/dependabot/pip/mastodon-py-1.5.0
Bump mastodon-py from 1.4.6 to 1.5.0
2019-11-24 18:31:45 +10:00
Lynne
56379175e0
Merge pull request #33 from Lynnesbian/dependabot/pip/beautifulsoup4-4.8.1
Bump beautifulsoup4 from 4.8.0 to 4.8.1
2019-11-24 18:31:31 +10:00
Lynne
24de5ac4a5
Merge pull request #35 from kanelillym/patch-1
Typo fix in README.md
2019-11-24 18:31:18 +10:00
Lilly Kane
56d16eb4f2
Typo fix in README.md
unecessarily > unnecessarily
2019-10-22 08:49:08 -07:00
dependabot-preview[bot]
f2b1cb6e00
Bump mastodon-py from 1.4.6 to 1.5.0
Bumps [mastodon-py](https://github.com/halcy/Mastodon.py) from 1.4.6 to 1.5.0.
- [Release notes](https://github.com/halcy/Mastodon.py/releases)
- [Changelog](https://github.com/halcy/Mastodon.py/blob/master/CHANGELOG.rst)
- [Commits](https://github.com/halcy/Mastodon.py/compare/1.4.6...1.5.0)

Signed-off-by: dependabot-preview[bot] <support@dependabot.com>
2019-10-15 05:07:37 +00:00
dependabot-preview[bot]
91f4c805f7
Bump beautifulsoup4 from 4.8.0 to 4.8.1
Bumps [beautifulsoup4](http://www.crummy.com/software/BeautifulSoup/bs4/) from 4.8.0 to 4.8.1.

Signed-off-by: dependabot-preview[bot] <support@dependabot.com>
2019-10-08 01:51:31 +00:00