Time |
Nick |
Message |
05:01 |
pinesol_green |
News from qatests: Test Success <http://testing.evergreen-ils.org/~live> |
07:05 |
|
NawJo joined #evergreen |
07:22 |
|
rjackson_isl joined #evergreen |
07:54 |
|
collum joined #evergreen |
08:15 |
|
JBoyer joined #evergreen |
08:27 |
|
agoben joined #evergreen |
08:39 |
|
mmorgan joined #evergreen |
08:39 |
|
NawJo joined #evergreen |
08:53 |
|
bos20k joined #evergreen |
09:06 |
|
Dyrcona joined #evergreen |
09:09 |
|
bos20k joined #evergreen |
09:20 |
|
maryj joined #evergreen |
09:38 |
|
kmlussier joined #evergreen |
09:51 |
* Dyrcona |
thinks his initial assessment of a "bug" might be wrong. |
10:00 |
Dyrcona |
Just doing this: "select distinct * from biblio.record_entry where not deleted" and print the marc field from the result leads to memory starvation. |
10:01 |
Dyrcona |
That is more or less the query if you do marc_export -a with no other selection options. |
10:02 |
Dyrcona |
I don't believe the starvation happened on Wheezy or Trusty. I can build a vm on one of those later to test it. |
10:02 |
|
mmorgan1 joined #evergreen |
10:06 |
jeff |
Dyrcona: in your test case above, you're still using perl + dbi, or do you have that issue even executing the query via psql? |
10:06 |
Dyrcona |
jeff: That is Perl + DBI. |
10:06 |
jeff |
Dyrcona: and, which distros have you encountered the issue on? |
10:06 |
Dyrcona |
This is Debian 8 Jessie. I've seen similar on Ubuntu 16.04, but have not tested this specific script. |
10:07 |
Dyrcona |
I'm doing it without the distinct to see if that makes a difference. |
10:07 |
Dyrcona |
Also, it's important to note that we have approximately 2.7 million bibs that are not deleted. |
10:07 |
* jeff |
nods |
10:07 |
jeff |
scale matters when trying to reproduce memory exhaustion bugs. :-) |
10:08 |
Dyrcona |
Yes. |
10:08 |
jeff |
unable to reproduce with an empty database, marking as INVALID! |
10:08 |
Dyrcona |
I can easily dump 50,000 at a time. |
10:08 |
* jeff |
ducks |
10:08 |
Dyrcona |
:) |
10:10 |
Dyrcona |
It looks like it is happening without the distinct, but the script hasn't been killed yet. |
10:10 |
Dyrcona |
Almost 5GB of RAM are in use. |
10:11 |
Dyrcona |
Now swap usage is increasing; I expect the OOM killer any minute now. |
10:12 |
Dyrcona |
I wonder if DBI has some new caching options/behaviors.... |
10:15 |
Dyrcona |
Yeah. It dies without the distinct and I get no output. |
10:17 |
|
kmlussier joined #evergreen |
10:18 |
jeff |
Dyrcona: and you're not using a "fetchall" DBI call, but "fetchrow"? |
10:19 |
Dyrcona |
Yes, execute then fetchrow in a loop. This is more or less what marc_export does. |
10:19 |
Dyrcona |
fetchrow_hashref to be precise. |
10:19 |
jeff |
do you know (with debugger or print/warn statements, etc) if it dies before the execute call completes? |
10:21 |
Dyrcona |
No, I don't know that. |
10:21 |
Dyrcona |
I'm kind of getting pulled in different directions at the moment. ;) |
10:23 |
* jeff |
nods |
10:25 |
Dyrcona |
jeff: I know this: I get no output in my file. So it doesn't fill an output buffer for Perl. |
10:25 |
Dyrcona |
I can try without the > filename and see if I get any output on the screen. I'll have to try the debugger later. I always have to brush up on the commands when I use it. |
10:26 |
Dyrcona |
And, the DBI docs only talk about cached connection handles AFAICT, but again, I'm doing too many things at the same time. |
10:27 |
Dyrcona |
Well, statement handles can be cached, too. That's nothing new. |
10:33 |
Dyrcona |
jeff: I'll fiddle with RowCacheSize in a bit. |
10:36 |
jeff |
Dyrcona: when you get to that point, DBD::Pg states: |
10:36 |
jeff |
RowCacheSize |
10:36 |
jeff |
Not used by DBD::Pg |
10:37 |
Dyrcona |
OK. |
10:38 |
|
mmorgan joined #evergreen |
10:38 |
jeff |
ah, here: |
10:38 |
jeff |
Cursors |
10:38 |
jeff |
Although PostgreSQL supports cursors, they have not been used in the current implementation. When DBD::Pg was created, cursors in PostgreSQL could only be used inside a transaction block. Because only one transaction block at a time is allowed, this would have implied the restriction not to use any nested SELECT statements. Therefore the "execute" method fetches all data at once into data structures located in the front-end application. This fact must to |
10:39 |
jeff |
truncated, ends as: |
10:39 |
jeff |
This fact must to be considered when selecting large amounts of data! |
10:41 |
jeff |
so, solution is either to LIMIT X OFFSET y, or declare a cursor as shown in the DBD::Pg docs. |
10:41 |
Dyrcona |
Despite all of that, something has changed since Perl 5.14, because a very similar script used to work there. |
10:41 |
jeff |
but internally, it (DBD::Pg) doesn't use cursors |
10:41 |
Dyrcona |
Yes, understood. And cursors are so 20th century.... :) |
10:43 |
Dyrcona |
I don't think it's a question of the amount of RAM changing. I believe the server that I used to run this on had 8GB, and I've run it on a vm with 8GB with similar results. |
10:43 |
Dyrcona |
Similar results meaning it crashes. |
10:43 |
jeff |
Yeah, I was about to suggest re-testing to see if you can reproduce on older distro with this same dataset. |
10:44 |
jeff |
But at this point, that's probably mostly to satisfy curiosity. |
10:44 |
jeff |
grabbing a configurable chunk of bibs at a time will probably be the fix. |
10:44 |
jeff |
default to 10k or 50k or whatever testing shows to use a semi-reasonable amount of ram. |
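The chunked approach jeff describes can be sketched roughly as follows. This is a minimal illustration, not what marc_export does: it is written in Python with an in-memory SQLite database standing in for PostgreSQL so that it runs on its own, the table is a simplified, hypothetical `record_entry` (not the real `biblio.record_entry` schema), and `iter_bibs` and `build_demo_db` are made-up names. It also swaps LIMIT X OFFSET y for a "WHERE id > last seen id" filter (keyset pagination), so each chunk does not have to skip over the rows already read. The same pattern should carry over to Perl + DBI with a prepared statement re-executed per chunk.

```python
import sqlite3
from typing import Iterator, Tuple


def build_demo_db(n: int = 25) -> sqlite3.Connection:
    """Create a small in-memory stand-in table (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE record_entry ("
        "id INTEGER PRIMARY KEY, marc TEXT, deleted INTEGER NOT NULL DEFAULT 0)"
    )
    # Every fifth record is flagged deleted so the filter has something to do.
    conn.executemany(
        "INSERT INTO record_entry (id, marc, deleted) VALUES (?, ?, ?)",
        [(i, f"<record>{i}</record>", 1 if i % 5 == 0 else 0)
         for i in range(1, n + 1)],
    )
    return conn


def iter_bibs(conn: sqlite3.Connection,
              chunk_size: int = 10_000) -> Iterator[Tuple[int, str]]:
    """Yield (id, marc) for non-deleted rows, at most chunk_size per query.

    Only one chunk is held in client memory at a time, instead of the
    whole result set.
    """
    last_id = None  # no chunk fetched yet
    while True:
        if last_id is None:
            rows = conn.execute(
                "SELECT id, marc FROM record_entry "
                "WHERE NOT deleted ORDER BY id LIMIT ?",
                (chunk_size,),
            ).fetchall()
        else:
            rows = conn.execute(
                "SELECT id, marc FROM record_entry "
                "WHERE NOT deleted AND id > ? ORDER BY id LIMIT ?",
                (last_id, chunk_size),
            ).fetchall()
        if not rows:
            return  # nothing left to export
        yield from rows
        last_id = rows[-1][0]  # resume after the highest id seen so far


if __name__ == "__main__":
    demo = build_demo_db()
    exported = sum(1 for _ in iter_bibs(demo, chunk_size=7))
    print(f"exported {exported} records")
```

The chunk size is the knob to tune: per jeff's suggestion, something like 10k or 50k, depending on what testing shows to be a reasonable amount of RAM for the record sizes involved.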
10:45 |
jeff |
which version of perl are you running on the problematic system? |
10:46 |
|
Christineb joined #evergreen |
10:47 |
jeff |
oh. jessie, therefore 5.20.2 |
10:59 |
Dyrcona |
Xenial is 5.22.something |
11:00 |
jeff |
https://rt.cpan.org/Public/Bug/Display.html?id=93266 is the DBD::Pg bug for fixing this long-term, which also isn't immediately useful. |
11:06 |
Dyrcona |
Right. |
11:06 |
Dyrcona |
I'm going to build some 8GB VMs with different distros: Wheezy, Jessie, Trusty, and Xenial to test this. |
11:07 |
Dyrcona |
Maybe not today, but soon. |
11:07 |
Dyrcona |
I may have a non-issue. :) |
11:20 |
|
fbeaudry joined #evergreen |
11:31 |
|
khuckins__ joined #evergreen |
11:32 |
graced |
Reminder: it is the last day to vote in the EOB elections. Voting ends at midnight! |
12:06 |
kmlussier |
graced: Is that reminder going out to the lists too? |
12:07 |
kmlussier |
Or maybe not the lists, but to the registered voters. |
12:08 |
graced |
kmlussier: I sent a reminder from the voting platform Friday to the voters who hadn't voted yet |
12:08 |
graced |
It won't allow me to send another |
12:08 |
kmlussier |
Ah, ok. I wouldn't have seen that because I had already voted. graced++ |
12:09 |
graced |
kmlussier++ #voting early |
12:10 |
kmlussier |
graced: Does voting Thursday afternoon after the EOB reminder qualify as early? |
12:12 |
graced |
if you made it in before the reminder email I think it qualifies |
12:30 |
* dbs |
registered late, but voted early |
12:37 |
kmlussier |
dbs++ |
12:40 |
|
jihpringle joined #evergreen |
12:51 |
|
jihpringle joined #evergreen |
13:02 |
|
rlefaive_ joined #evergreen |
15:04 |
|
mmorgan1 joined #evergreen |
15:43 |
|
Jillianne joined #evergreen |
15:50 |
|
mmorgan joined #evergreen |
16:14 |
|
gsams_ joined #evergreen |
16:19 |
|
eady joined #evergreen |
16:20 |
|
rlefaive joined #evergreen |
16:29 |
|
kmlussier joined #evergreen |
17:01 |
pinesol_green |
News from qatests: Test Success <http://testing.evergreen-ils.org/~live> |
17:05 |
|
mmorgan left #evergreen |
23:20 |
|
genpaku joined #evergreen |