Time |
Nick |
Message |
06:01 |
pinesol |
News from qatests: Testing Success <http://testing.evergreen-ils.org/~live> |
08:28 |
|
mantis1 joined #evergreen |
08:31 |
|
rfrasur joined #evergreen |
08:40 |
|
mmorgan joined #evergreen |
09:09 |
|
Dyrcona joined #evergreen |
09:09 |
Dyrcona |
This bit from a long running search query just doesn't look right: btrim(regexp_replace(translate_isbn1013(naco_normalize(split_date_range($_26201$winter$_26201$))) |
09:10 |
Dyrcona |
translate_isbn1013 wrapped around split_date_range. That's gotta be some kind of a search bug. |
09:35 |
Bmagic |
agreed, looks wrong |
09:38 |
Dyrcona |
I'm looking through the apache logs to see if I can find the parameters that caused it. |
09:39 |
|
jvwoolf joined #evergreen |
09:39 |
miker |
that's gonna be normalizers attached to whatever field they were searching. it's config. maybe just a side effect of doing a identifier-class search |
09:39 |
Dyrcona |
Well, this is "new." I was disconnecting a connected USB drive that sometimes errors when being disconnected, and now my trackpad isn't working. Probably gonna have to reboot. |
09:41 |
Dyrcona |
Bluetooth mouse works, so I'm going to see if some usb module crashed. |
09:42 |
Dyrcona |
miker: That "side effect" probably kept the query running for over 14 hours, and it would likely never stop unless stopped externally. |
09:43 |
Dyrcona |
My USB issue looks a bug with LUKS devices. I see it being dismounted in the logs, then udisks tries to access it again. |
09:44 |
miker |
hrm, seems unlikely that those normalizers would do that, since they should only be run once. it's a constant being passed in, and it's in a cte |
09:45 |
Dyrcona |
Well, IDK, but it ran for 14 hours and 20 minutes before I canceled it. After I get my laptop sorted, I'm going to get together the relevant details and open a Lp bug. I'll let those smarter than I figure it out. |
09:45 |
miker |
if you have the whole query, I'd be interested to see it |
09:46 |
Dyrcona |
I can probably get it from the pg logs. Long-running queries are logged. What I got from pg_stat_activity was only part of the query. |
09:47 |
Dyrcona |
I'm just gonna reboot. I see lots of fs and driver errors. I think the drive probably jiggled loose before I tried to put it away. |
09:50 |
|
Dyrcona joined #evergreen |
09:51 |
Dyrcona |
OK. I'm back. Restart didn't work either. I had to power it off with the button half-way through. This Dell has done that from time to time..... |
09:57 |
Dyrcona |
So, the query was for "winter sports." What I'm seeing in the Apache logs looks like it might have succeeded. |
10:02 |
|
terranm joined #evergreen |
10:03 |
Dyrcona |
So, in the postgresql log when there's something like [42-NN] where NN is a series of numbers that increase by 1 that means those log lines are related, right? |
10:04 |
mmorgan |
Dyrcona: That's been my experience |
10:05 |
Dyrcona |
yeahp. so looks like a 99-line query, then. |
10:06 |
Dyrcona |
I think I have to go to the brick head to find the apache request. What I see on the syslog server don't look like the actual searches. |
10:08 |
mmorgan |
Sounds about right for a search query :) |
10:10 |
Dyrcona |
Well, this is problematic. Looks like they tried a few variations. I'll have to see if I can find the one that caused the problem. |
10:11 |
Dyrcona |
mmorgan: Unfortunately, yes. I wrote a 250,000 line (give or take) on Monday using a spreadsheet and a Perl program. I was asked to update about 28,000 copies and call numbers. |
10:12 |
Dyrcona |
Would this part of the query be unique to different variations: $_26201$winter$_26201$ ? |
10:13 |
JBoyer |
Dyrcona, yeah, the numbers are the pid of some process in the call chain. (I forget which) |
10:13 |
Dyrcona |
Well, nice! I don't even find that string in the pg logs on the syslog server. UDP is not being helpful. |
10:14 |
* mmorgan |
thought those numbers were dollar quotes, not pids, which also would not necessarily be unique |
10:15 |
Dyrcona |
Right, the $_2620$ part is not unique. |
10:15 |
JBoyer |
They are dollar quotes, using a process pid to make it incredibly unlikely that any possible search term could accidentally or intentionally allow sql injection. |
10:16 |
Dyrcona |
I had to grep for the _ and number. I think the $ wasn't properly escaped when I tried before. |
10:16 |
mmorgan |
Ah! Interesting! |
10:18 |
Dyrcona |
Yeah, I guess this variation of the query didn't make it to the syslogger.... I can't find it in yesterday's logs even when using fixed strings. |
10:21 |
Dyrcona |
Oh... that shows up in today's logs..... I figured it would be in yesterday's, since it ran for 14+ hours and we log long-running queries after so many minutes. |
10:24 |
Dyrcona |
So, it's 142 lines... |
10:25 |
Dyrcona |
miker: You want me to try and srip the pg log cruft off? |
10:28 |
miker |
(sorry, meeting... feel free to dump the raw log lines if that's easy. I can ctrl-v in vim to cut off chunks) |
10:30 |
Dyrcona |
S'alright, I cleaned it up already. |
10:41 |
Dyrcona |
Whee! Just got a spreadsheet with call number updates in UTF-16 because of accents in the labels. Fortunately, it converts to UTF-8. |
11:06 |
|
collum joined #evergreen |
11:09 |
|
_collum joined #evergreen |
11:09 |
Dyrcona |
OK Back to search bugs... |
11:15 |
Dyrcona |
So, I guess if a query came through websockets, I could see a gateway entry for the query and no apache log entry? |
11:17 |
Dyrcona |
It's in /var/log/apache2/other_vhosts_access.log. (Well, with the .1 since it's from yesterday.) |
11:18 |
Dyrcona |
I don't think that makes it to the syslog server. |
11:20 |
Dyrcona |
It looks like a relatively simple query. |
11:34 |
Dyrcona |
miker: Bug 1960431 |
11:34 |
pinesol |
Launchpad bug 1960431 in Evergreen "Long-running Qurey from relatively simple search" [Undecided,New] https://launchpad.net/bugs/1960431 |
11:46 |
miker |
Dyrcona: do you think you could get an EXPLAIN for that query, in the db where it got stuck? |
11:48 |
miker |
(I have a feeling it's the rare-common-rare term pattern, maybe an intersection with pg10) |
12:11 |
Dyrcona |
miker: I'll get the explain. |
12:21 |
Dyrcona |
miker: https://bugs.launchpad.net/evergreen/+bug/1960431/+attachment/5560027/+files/78.plan |
12:22 |
pinesol |
Launchpad bug 1960431 in Evergreen "Long-running Qurey from relatively simple search" [Undecided,New] |
12:34 |
|
collum joined #evergreen |
12:36 |
|
collum joined #evergreen |
12:52 |
|
jihpringle joined #evergreen |
12:56 |
miker |
Dyrcona: I assume searching for those terms still spins forever? |
13:02 |
miker |
nope, or, not really. it's the exact match part. the "'d words |
13:03 |
Dyrcona |
miker: I haven't tried that search. |
13:03 |
miker |
looks like we should push the exact match construct down into the initial search structures. I wonder if my "make joins CTEs" patch helps here. I'll dig that up. |
13:08 |
Dyrcona |
Did they actually type "winter AND sports" in the search box? Do I understand that corectly? I did a keyword from advanced search with winter in one and sports in the other and it came back rather quickly. |
13:09 |
jeff |
I haven't specifically tested the "make joins CTEs" patch against postgresql < 12 and >= 12, but the change in CTE optimization fence behavior suggests that it might be a good idea (to test/compare). :-) |
13:09 |
Dyrcona |
Looks like they typed "winter" AND "sports" |
13:10 |
Dyrcona |
I hate when they try to outsmart the query engine. They usually do with bad consequences. :) |
13:11 |
Dyrcona |
Yeahp. The variation with the quotes takes longer/hangs. Hasn't been long enough to tell, but the quotes definitely take longer. |
13:14 |
Dyrcona |
Ha! Looks like they tried again. In addition to my query, I see another running from about 10 minutes earlier. |
13:19 |
Dyrcona |
It's definitely the double quotes. |
13:22 |
miker |
Dyrcona: well, I tried it without the and ... sorry :/ |
13:25 |
miker |
anyway, I do think the CTE branch helps, but the exact-match filter would be better right next to the tsquery @@ |
13:27 |
Dyrcona |
miker: Looks like it hangs with or without the AND. I was just trying to reproduce what happened last night. |
13:30 |
Dyrcona |
miker: Which branch is that? When I grep branch names in the working repo, I only find two from dbs that look relevant. |
13:31 |
miker |
it may not be pushed to working :( |
13:32 |
miker |
I'll get (a new version? the first version?) of it out soon and point you at it. I'd have sworn it was floating around, though... |
13:35 |
miker |
yeah, dbs' was the first step, getting the tsquery construction out of the main select. I do the same to the LEFT JOINs in the main select that represent a search based on each of the dbs CTEs |
13:36 |
miker |
but I don't see the branch "out there" ... going to search harder, and push one failing that |
13:40 |
miker |
go git, go! |
13:44 |
|
jvwoolf joined #evergreen |
13:49 |
Dyrcona |
:) |
14:59 |
|
jvwoolf joined #evergreen |
16:02 |
|
jihpringle joined #evergreen |
17:03 |
|
jvwoolf left #evergreen |
17:10 |
|
mmorgan left #evergreen |
18:01 |
pinesol |
News from qatests: Testing Success <http://testing.evergreen-ils.org/~live> |