Time |
Nick |
Message |
06:59 |
|
cbrown joined #evergreen |
08:02 |
|
BDorsey joined #evergreen |
08:04 |
|
ianskelskey joined #evergreen |
08:04 |
|
ianskelskey left #evergreen |
08:43 |
|
mmorgan joined #evergreen |
08:46 |
|
ianskelskey joined #evergreen |
08:52 |
|
kworstell-isl joined #evergreen |
08:54 |
|
dguarrac joined #evergreen |
09:06 |
|
mdriscoll joined #evergreen |
09:08 |
|
Dyrcona joined #evergreen |
09:11 |
|
kworstell-isl joined #evergreen |
09:34 |
|
mdriscoll joined #evergreen |
10:58 |
|
Christineb joined #evergreen |
11:03 |
* Dyrcona |
wonders if doing the translations bits for different releases on the same VM will cause problems. Probably not. |
11:06 |
* Dyrcona |
does release building on a dedicated vm, then tests it elsewhere. |
11:12 |
jeffdavis |
We've got a recurring situation where ill-behaved webcrawlers doing ~200 qtype=subject queries per minute can cause OOM problems. |
11:12 |
jeffdavis |
We're playing whack-a-mole with IP addresses since the crawlers don't properly identify themselves. |
11:12 |
jeffdavis |
It would be nice if EG handled that volume of requests more gracefully. |
11:24 |
|
ianskelskey joined #evergreen |
11:42 |
|
ianskelskey joined #evergreen |
12:00 |
|
smayo joined #evergreen |
12:05 |
Dyrcona |
jeffdavis: If the IPs are relatively consistent, you can block them in nginx or a firewall. We outright block China and some other countries. |
13:01 |
|
mdriscoll joined #evergreen |
13:01 |
Bmagic |
jeffdavis: I had that on the hackaway agenda actually. We never talked about it though: https://wiki.evergreen-ils.org/doku.php?id=hack-a-way:hack-a-way-2024 LP 1913617 LP 1361782 |
13:01 |
pinesol |
Launchpad bug 1913617 in OpenSRF "NGINX could use a DOS mitigation example" [Undecided,New] https://launchpad.net/bugs/1913617 |
13:01 |
pinesol |
Launchpad bug 1361782 in Evergreen 3.9 "Evergreen Denial of Service easily accomplished" [Medium,Fix released] https://launchpad.net/bugs/1361782 |
13:02 |
Bmagic |
jeffdavis: instituting the rate limiting is another idea that we've explored for specific URL patterns. Example included in that "DOS mitigation example" bug |
13:04 |
Bmagic |
and yes, I agree that Evergreen itself should handle the requests more gracefully. Evergreen actually does have limit mitigation for high volumes of requests from the same IP, but not from multiple IPs. We're seeing patterns of 12k IP addresses acting together; it's annoying |
13:09 |
jeffdavis |
yeah that's what we're seeing here, less than a handful of requests per IP but a bunch of different IPs (currently 862 in my blocklist) |
13:10 |
jeffdavis |
with the subject searches in particular it seems like they start to be a problem when we see >60/minute; most of the time, traffic for those queries maxes out at 30-40/min |
13:11 |
jeffdavis |
it's a lot of queries in a way, but not enough that you'd expect them to take down a search system |
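Note: the per-minute figures above (trouble above roughly 60 subject searches per minute, normal peaks of 30-40) suggest counting one query type across all client IPs, not per IP. Below is a minimal Python sketch of such a sliding-window counter. The class name, the `is_subject_search` helper, and the default threshold are hypothetical illustrations, not part of Evergreen or OpenSRF.

```python
from collections import deque


class SlidingWindowCounter:
    """Count events seen during the most recent `window_seconds` seconds."""

    def __init__(self, threshold=60, window_seconds=60.0):
        # threshold=60 mirrors the ">60/minute" figure from the chat (an assumption).
        self.threshold = threshold
        self.window_seconds = window_seconds
        self._times = deque()

    def record(self, timestamp):
        """Record one request; return True if the window now exceeds the threshold."""
        self._times.append(timestamp)
        cutoff = timestamp - self.window_seconds
        # Drop requests that have aged out of the window.
        while self._times and self._times[0] <= cutoff:
            self._times.popleft()
        return len(self._times) > self.threshold


def is_subject_search(url):
    """Hypothetical matcher for the query type mentioned in the chat."""
    return "qtype=subject" in url
```

A caller would feed `record()` the timestamp of every request for which `is_subject_search()` is true, regardless of source IP, and start throttling or logging once it returns True.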
13:12 |
Bmagic |
yep, and weird stuff in the search string that causes PG to take a long time, only to get zero results; then the symspell attempt takes another 30 seconds and returns nothing. |
13:14 |
Bmagic |
so, as a stop-gap measure, I've truncated the symspell table to cut the damage in half right away, then tried to analyze the traffic and come up with a way to block them. I ended up blocking the whole class B (in some cases) |
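Note: one way to decide which ranges deserve a whole-network block is to group the offending addresses by network and see where they cluster. The sketch below assumes "class B" means an IPv4 /16 and that the offending addresses have already been extracted from the access logs. The function name and parameters are hypothetical.

```python
import ipaddress
from collections import Counter


def busiest_networks(addresses, prefix_len=16, min_hosts=2):
    """Group distinct IPv4 addresses by enclosing network.

    Returns (network, distinct_host_count) pairs, busiest first, for
    networks that contain at least `min_hosts` distinct addresses.
    """
    counts = Counter()
    for addr in set(addresses):  # count each address once
        # strict=False masks off the host bits, e.g. 198.51.100.7/16 -> 198.51.0.0/16
        net = ipaddress.ip_network(f"{addr}/{prefix_len}", strict=False)
        counts[net] += 1
    return [(str(net), n) for net, n in counts.most_common() if n >= min_hosts]
```

Networks that appear here with many distinct hosts are candidates for a single range rule; isolated addresses are left for per-IP handling.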
13:15 |
Bmagic |
I was working on a document to present to the hackaway: https://docs.google.com/document/d/1zG5ZzB5y4U8BOKdEuRrxNFbaPd3o8em7EkpaAQA5WgE/edit?usp=sharing |
13:21 |
Bmagic |
jeffdavis: Another thing we're rolling out is GeoIP: an NGINX plugin combined with cron to get a fresh list of IP/country mappings, then blocking based on country. Using this as a guide basically: https://shashanksrivastava.medium.com/block-a-website-in-specific-countries-using-nginx-20a651288795 |
13:23 |
jeffdavis |
All very useful stuff, thanks! I've been holding out hope that we can block Huawei's datacentre in Singapore without blocking the entire country, but maybe we can't. :( |
13:38 |
|
ianskelskey joined #evergreen |
14:46 |
|
ianskelskey joined #evergreen |
15:35 |
Bmagic |
lol, dojo finally removed the tarball that we've been relying on: wget http://download.dojotoolkit.org/release-1.3.3/dojo-release-1.3.3.tar.gz |
15:36 |
Bmagic |
geez |
15:36 |
Bmagic |
I was wondering when that was going to happen |
15:36 |
Bmagic |
so.... we host it? |
15:37 |
Bmagic |
we're already hosting it for the make build process... I suppose we just fix our install instructions to point to that |
15:38 |
Dyrcona |
Yeah that should work. |
15:40 |
Dyrcona |
I get an incomplete download file. |
15:40 |
Dyrcona |
That Dojo that we're hosting has some custom dojo files in it, IIRC. |
15:43 |
Dyrcona |
Interesting... The incomplete file that I downloaded has the same checksum as dojo-release-1.3.3.tar.gz that I had downloaded previously, so maybe the download wasn't actually removed and Chrome is just being weird about it? |
15:45 |
Dyrcona |
So, yeah, the download works for me in Firefox. |
15:45 |
Bmagic |
which one? from the dojo server? |
15:45 |
Dyrcona |
Yeah. |
15:45 |
Dyrcona |
The link you pasted. |
15:46 |
csharp_ |
Bmagic: https |
15:46 |
csharp_ |
I found that same issue when I was tinkering with berick's ansible stuff |
15:46 |
Bmagic |
!!! https indeed |
15:46 |
Bmagic |
lol, I think berick copied off of my old ansible when he built his |
15:46 |
csharp_ |
Bmagic++ |
15:47 |
Bmagic |
and to this day, I'm still using http |
15:47 |
Bmagic |
hilarious |
15:47 |
Dyrcona |
We could change it to curl from wget also. |
15:48 |
Dyrcona |
When I downloaded with Chrome, it named the file 'Unconfirmed 618535.crdownload' but it has the same size and checksum as the tarball. |
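Note: the size-and-checksum comparison described above can be scripted. The log does not say which checksum tool was used, so this is a minimal sketch that assumes SHA-256. The function names are hypothetical.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b"" at EOF.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_content(path_a, path_b):
    """True if both files have identical SHA-256 digests."""
    return sha256_of(path_a) == sha256_of(path_b)
```

Calling `same_content()` on the `.crdownload` file and a previously saved copy of the tarball would confirm whether the oddly named download is in fact complete.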
15:49 |
csharp_ |
@band add Unconfirmed Crdownload |
15:49 |
pinesol |
csharp_: Band 'Unconfirmed Crdownload' added to list |
15:50 |
Bmagic |
lol |
15:52 |
Dyrcona |
I think we can just update the README without much formality. |
15:55 |
Bmagic |
I agree |
15:55 |
Dyrcona |
I'm doing it right now. |
15:56 |
Bmagic |
Dyrcona++ |
15:56 |
Dyrcona |
Just checking that https really works with wget and it does. |
15:56 |
Bmagic |
yeah, probably just put the s in there |
15:56 |
Bmagic |
a one-character commit |
15:56 |
csharp_ |
just sneak it on in |
15:56 |
Dyrcona |
Yeah. |
15:57 |
Dyrcona |
I credit both of you in the commit message and link to the IRC log. |
15:59 |
Bmagic |
I think it's hilarious that we're downloading that file. I bet their apache logs indicate a huge popularity on that file. "Huh, we must have really made a good release when we made 1.3.3" "It's downloaded 100 times more often than anything else on our website" |
15:59 |
Dyrcona |
:) |
15:59 |
Dyrcona |
I did get a 404 from wget when trying http, but https still works. |
16:00 |
Dyrcona |
If we ever need to host it ourselves, we have the custom dojo.tgz file and we can just throw a copy of dojo-release-1.3.3.tar.gz on the server. |
16:00 |
pinesol |
News from commits: Update Dojo download URL in README/install doc <https://git.evergreen-ils.org/?p=Evergreen.git;a=commitdiff;h=eb6545b9de257f29af3280ee278d25138039c1e3> |
16:00 |
Dyrcona |
dojo.tgz doesn't require the copy or move of files if expanded correctly. |
16:00 |
Bmagic |
It's baked into so many of my automatic builds, and those builds have found their way around to people out there, as well as berick's ansible, it's gotta be getting downloaded at least 5 times a day, maybe 10 |
16:01 |
Dyrcona |
Could be. I keep a copy around in the set of files that I put on my new vms. Actually, I have a copy of both dojo archives and usually install ours. |
16:02 |
Dyrcona |
Maybe all those downloads are why they're blocking it for http? |
16:02 |
Bmagic |
probably, or they finally upgraded their web server to current standards |
16:03 |
Dyrcona |
They're probably using a CDN. |
16:04 |
Dyrcona |
dojo was very popular for a while. Dunno how much it is used any more, but it seems to be hanging in there. |
16:04 |
Bmagic |
apparently. Someone is still paying for that site to be online |
16:05 |
Bmagic |
it gets redirected to: https://dstorejs.io/ |
16:06 |
Bmagic |
well, the download subdomain does at least. https://dojotoolkit.org/ is still up |
16:30 |
pinesol |
News from commits: LP#2084837: Show operators for resolved datatype <https://git.evergreen-ils.org/?p=Evergreen.git;a=commitdiff;h=d6feb7402835fbc09ece5db8f6f241ffa09fcd9c> |
16:48 |
mmorgan |
Dyrcona: Thanks for finding and resolving my omission! |
16:48 |
mmorgan |
Dyrcona++ |
16:49 |
Dyrcona |
mmorgan++ No problem. |
17:04 |
|
mmorgan left #evergreen |