
IRC log for #evergreen, 2015-01-26


All times shown according to the server's local time.

Time Nick Message
01:43 RBecker joined #evergreen
06:07 temur joined #evergreen
06:28 Temur joined #evergreen
06:32 Temur joined #evergreen
07:00 jboyer-isl joined #evergreen
07:32 graced joined #evergreen
07:49 BigRig joined #evergreen
07:52 jeff joined #evergreen
07:52 jeff joined #evergreen
07:53 mtate joined #evergreen
07:55 julialima_ joined #evergreen
08:00 pmurray joined #evergreen
08:02 rjackson-isl joined #evergreen
08:07 collum joined #evergreen
08:23 akilsdonk joined #evergreen
08:30 mdriscoll1 joined #evergreen
08:40 Shae joined #evergreen
08:40 StomproJ joined #evergreen
08:40 Dyrcona joined #evergreen
08:42 Stompro joined #evergreen
08:43 mmorgan joined #evergreen
08:43 abowling joined #evergreen
08:44 Dyrcona Good morning, #evergreen!
08:46 mmorgan Good Morning!
08:59 yboston joined #evergreen
09:03 mrpeters joined #evergreen
09:13 Dyrcona Blizzard warning, and for some reason I'm stoked.
09:18 Dyrcona I'm not usually a fan of snow, but I'm kind of looking forward to it.
09:18 Dyrcona The snow we got this weekend was so pretty.
09:19 * mmorgan hears that the coming snow will be a lot *prettier* ;-)
09:19 dbs moin
09:20 Dyrcona mmorgan++
09:20 Dyrcona dbs is no stranger to snow, eh?
09:20 * Dyrcona works on a script to shut down webrick.
09:22 dbs Dyrcona: true true! but haven't had a 2-3 foot dump for quite some time. been skating on the lake and nordic skiing for the past month though; loving it
09:29 Dyrcona Sounds fun! I've never been skiing, and the rivers don't freeze here like they used to.
09:30 RoganH joined #evergreen
09:30 Stompro Anyone used pg barman with your EG database?  http://www.pgbarman.org/
09:30 Dyrcona And a script to stop webrick is only 7 lines long, including two blank lines and a sanity check that ps -C ruby -o pid= actually found a running process.
09:36 Dyrcona As tsbere points out to me in a private chat, `ps -C ruby -o pid,args | grep edi_webrick | cut -f1 -d\ ` is safer, in case we run more than 1 ruby process.
09:38 Dyrcona No, it's not for an init script.
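A minimal sketch of such a shutdown script, built on the safer pipeline tsbere suggested. The edi_webrick match, the use of awk instead of cut (the pid column's leading-space padding can leave cut's first field empty), and the plain kill are assumptions, not the exact script from the channel:

```sh
#!/bin/sh
# Stop the EDI webrick process; sketch only, adjust the match and
# signal to taste.
pid=$(ps -C ruby -o pid,args | grep edi_webrick | awk '{print $1}')

# Sanity check: only signal if we actually found a running process.
if [ -z "$pid" ]; then
    echo "no edi_webrick process found" >&2
    exit 1
fi

kill $pid  # unquoted on purpose: may hold more than one pid
```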
09:47 rjackson-isl joined #evergreen
10:26 Dyrcona This is the theme song for our Evergreen server this morning: https://www.youtube.com/watch?v=a1IyrVeX0W8
10:27 Dyrcona It keeps hitting 200 cstores and then backing down.
10:31 dbs Argh. That's so frustrating :(
10:33 berick Dyrcona:  have you applied any patches?
10:33 Dyrcona berick: Yes.
10:34 Dyrcona When we go down again, we're going to apply all of the latest O/S updates and force reinstall/update all CPAN packages as well.
10:34 berick yowza
10:34 Dyrcona Our current thinking is the problem lies with the packages.
10:35 Dyrcona Something from CPAN could be out of sync with a C library that was updated.
10:35 Dyrcona I actually want it to go down today.
10:36 Dyrcona But, now we're back down to 83 cstore drones total across two servers.
10:36 Dyrcona We're going to get new hardware soon, and we'll configure things very differently.
10:38 Dyrcona Thing is, when the load hits 200, you'd expect to see 16 or so processes in top, each using 100% or more of CPU.
10:38 Dyrcona That doesn't happen.
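For reference, one quick way to take that sort of drone count from the shell; the process-title match is an assumption about how OpenSRF labels drones on a given system:

```sh
# Count open-ils.cstore drones on this host. The [o] keeps grep from
# matching its own command line.
ps ax | grep '[o]pen-ils.cstore' | wc -l
```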
10:38 berick Dyrcona: would it be possible to share your log files?
10:38 berick well, the drones are probably sitting there waiting for the processes that connected to them to do something
10:39 berick but said process has probably forgotten about them
10:39 Dyrcona berick: I could possibly share them.
10:39 berick so they are just hanging out waiting to time out
10:39 Dyrcona Well, I don't think CStoreEditor is the problem, though we should still probably clean up the code.
10:40 Dyrcona I mentioned last week that I added a warn to the destructor and did some tests on my dev machine, and nearly every other line in the logs was the destructor being called.
10:41 berick that's great, but doesn't help when the destructor isn't called, like when refs are still in scope
10:42 Dyrcona And now, we're down to 41 cstore drones.
10:42 Dyrcona Thing is, I haven't found anywhere the refs look like they would still be in scope.
10:42 Dyrcona If I do, I'll put that line back in and see if I can trigger that code.
10:42 berick Dyrcona: if you could share the osrfsys and activity logs collected during one of the explosions, that would be great
10:59 akilsdonk joined #evergreen
11:01 mrpeters joined #evergreen
11:12 bshum Ugh, more reingesting.
11:12 bshum gmcharlt++ # patch for LP 1414112
11:12 pinesol_green Launchpad bug 1414112 in Evergreen 2.7 "Audience filter no longer searching blank spaces" (affected: 2, heat: 12) [Medium,Confirmed] https://launchpad.net/bugs/1414112
11:13 goood bshum: we can add audience without a reingest, don't worry.  I'll whip up an update statement
11:13 goood and also,
11:13 goood gmcharlt++
11:25 goood ah, we'll need one more change for that... I'll push a collab branch
11:38 RoganH joined #evergreen
11:47 vlewis joined #evergreen
11:48 sandbergja joined #evergreen
11:49 vlewis_ joined #evergreen
11:54 dreuther joined #evergreen
12:33 buzzy joined #evergreen
12:49 jihpringle joined #evergreen
13:27 gdunbar joined #evergreen
13:34 Bmagic_ Has anyone heard of OpenSRF performing 4 lookups per hold during item checkin? My logs show exactly 4 lookups for every potential hold on an item during a checkin. It happens via SIP and via staff client (provided that the checkin_lib != call_number.owning_lib)
13:36 jeff i have seen similar, yes.
13:37 jeff it can lead to timeouts in scenarios where there are many (unsuitable due to age protection/etc) holds on the record for a copy.
13:38 bshum Indeed.
13:38 jeff i thought that in some very preliminary digging, signs had pointed to it being exacerbated by "clear shelf expired holds", but of course that wouldn't apply for SIP (or maybe it could be made to in some environments...)
13:39 bshum For us i think it was dealing with large numbers of holds to check through.  Opportunistic capture I think.
13:40 bshum So the more holds there were, the longer it took in general to check in.
13:41 Bmagic_ bshum: jeff: yes, however, even if there is only one hold, it still looks 4 times but you don't notice it because it doesn't take very long
13:41 * jeff nods
13:41 bshum Bmagic_: That doesn't surprise me.  The checking code is weird....
13:42 Bmagic_ interestingly enough, it only checks once if you are checking the item in at its home library
13:44 Bmagic_ I have done the work to isolate the cases, and I have log files for each scenario. Is this a bug?
13:45 jeff i recommend you open a bug (after doing your own check to see if there's one already, but I don't think there is)
13:46 jeff i'll try to give it some eyes also, since we've observed similar.
13:47 Bmagic_ jeff: ok, I will check that out. Not sure how to phrase my search terms....
13:47 * jeff nods
13:49 Bmagic_ jeff: I'm creating a new bug
13:51 tsbere jeff: I will point out that "clear shelf expired holds" should only add calls for copies that are flagged as on the hold shelf, so any other situation shouldn't do much more there.
13:51 berick Bmagic_: can you clarify what you mean by "it looks 4 times"
13:52 jeff tsbere: pretty sure what i saw agrees with your statement, yes.
13:52 Bmagic_ berick: grep "checking if hold" osrfsys.log yields [INFO:16784:Holds.pm:3144:1422189177127601] circulator: checking if hold 216012 is permitted for copy PPL60121 x4
13:53 tsbere Bmagic_: Some of those may actually end up being dupes (different pieces logging the same call) though I can't be certain without looking closer.
13:54 Bmagic_ tsbere: weird that it only logs once when checkin occurs at circ_lib
13:54 * Dyrcona hasn't been this excited about a snow day since he was in primary school.
13:55 berick Bmagic_: and it's checking the same hold ID with every call?
13:55 Bmagic_ berick: indeed
13:56 Bmagic_ 4 times
13:56 Bmagic_ always exactly 4
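One way to tally that across a whole log; the log path is an assumption, and the message format is taken from the grep output above:

```sh
# Count how many times each hold ID was checked, most-checked first.
grep "checking if hold" /openils/var/log/osrfsys.log \
  | sed -n 's/.*checking if hold \([0-9]\+\) is permitted.*/\1/p' \
  | sort | uniq -c | sort -rn | head
```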
13:56 Dyrcona Bmagic_: It's being thorough, like checking fines twice on a renewal.
13:56 Bmagic_ Dyrcona: lol no doubt
13:57 Bmagic_ It's an issue when there are many potential holds. It can take so long, in fact, that the SIP client will time out. I have logs from SIP that log the response time, clocked at >15 seconds
13:58 berick many potential holds doesn't explain why it would check the same hold more than once
13:58 berick Bmagic_: can you tell if open-ils.storage.action.hold_request.nearest_hold.atomic is called once or multiple times?
13:59 Bmagic_ berick: I am not saying that the cause is due to many holds. I think the cause is due to item checkin_lib != item owning_lib
13:59 Bmagic_ berick: because it only checks the potential hold once if you are checking the item in at its circ_lib/owning_lib
14:00 tsbere Bmagic_: Are you using the "if things haven't gone home for interval whatever send them home" features?
14:00 Bmagic_ http://pastebin.com/snAW2WkF
14:02 Bmagic_ tsbere: "Max foreign-circulation time" ?
14:04 Bmagic_ berick: open-ils.storage.action.hold_request.nearest_hold.atomic is called once
14:06 berick yeah...
14:06 berick that suggests that open-ils.storage.action.hold_request.nearest_hold needs to be unique-ifying its return list
14:06 Bmagic_ afk
14:06 bshum Interesting...
14:08 RoganH un_helpful advice: wipe your server, start over
14:12 berick i wonder if "SELECT DISTINCT(h.id)" in action:nearest_hold would do it
14:17 tsbere berick: Just adding DISTINCT before the existing h.id should work (distinct the entire column list....which is one column)
14:20 berick tsbere: yeah, that's what I meant.  just insert DISTINCT
14:21 * berick wonders if Bmagic_ is up for an experiment
14:22 tsbere Bmagic_: Out of curiosity, does this query result in anything for you?   SELECT count(id), from_org, to_org FROM actor.org_unit_proximity GROUP BY from_org, to_org HAVING count(id) > 1;
14:23 Bmagic_ I'll check when I get back. Bout 45 minutes
14:24 * dbs gets zero rows for that
14:24 tsbere dbs: As far as I know you should get zero rows. Uniqueness of from/to org combinations is not, however, guaranteed at the DB level...
14:25 * dbs wipes brow in relief
14:25 tsbere Which could result in extra rows when looking up holds, which would result in the need for said DISTINCT addition and the extra lookups.
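A toy psql session, not Evergreen's schema or the real nearest_hold query, showing how duplicated proximity rows fan one hold out into four result rows and how DISTINCT collapses them:

```sh
psql <<'SQL'
-- Throwaway tables just to illustrate the join fan-out:
CREATE TEMP TABLE holds (id INT);
CREATE TEMP TABLE prox (from_org INT, to_org INT);
INSERT INTO holds VALUES (216012);
INSERT INTO prox SELECT 1, 2 FROM generate_series(1, 4);  -- 4 dupes

-- One hold joined through four duplicate proximity rows: four rows back.
SELECT h.id FROM holds h JOIN prox p ON p.from_org = 1;

-- DISTINCT collapses them to one.
SELECT DISTINCT h.id FROM holds h JOIN prox p ON p.from_org = 1;
SQL
```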
15:00 kbutler joined #evergreen
15:15 Bmagic_ tsbere: OMG, that query returns 5929 rows, with all counts=4!
15:16 berick tsbere++
15:17 tsbere Bmagic_: You need to fix that. ;) autogen.sh -u may work?
15:18 Bmagic_ tsbere: ok, that did fix that
15:19 tsbere Bmagic_: In that case your "4 times" issue is likely fixed
15:20 Bmagic_ tsbere: testing
15:25 Bmagic_ tsbere: yep, solved
15:25 Bmagic_ tsbere++
15:26 Bmagic_ tsbere: can you help me understand why the DB returned those rows in that function?
15:26 Bmagic_ (I don't think we run autogen with -u switch)
15:26 tsbere Bmagic_: Your org unit proximities were doubled up for some reason. You should only have one row for each from/to combination, you had 4, but likely only one for each org unit to itself.
15:27 tsbere Thus, when checking in at the "home" library you got one back. When checking in anywhere else you got four. Thus, "extra" holds. The autogen.sh script, when run with -u, *empties* the table and re-builds the proximities.
15:28 Bmagic_ tsbere: if we pass -u to autogen, does it do everything it does without the -u switch?
15:29 * berick wonders how they get doubled up
15:29 tsbere Bmagic_: Generally you don't need to pass -u to autogen. Something went wrong for you in the past, creating the extra entries. In this case passing -u to tell it to do that extra bit was a "fix the broken table contents" step.
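Putting the diagnosis and the fix together, roughly; the database name and the /openils/bin path are assumptions about a typical Evergreen install:

```sh
# 1. Look for duplicated from/to proximity rows (tsbere's query above):
psql -d evergreen -c "
  SELECT count(id), from_org, to_org
    FROM actor.org_unit_proximity
   GROUP BY from_org, to_org
  HAVING count(id) > 1;"

# 2. If any rows come back, -u tells autogen.sh to empty and rebuild
#    actor.org_unit_proximity:
/openils/bin/autogen.sh -u
```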
15:31 dbwells tsbere++
15:38 Bmagic_ LOL it turns out the extra lookups didn't contribute much to the total time!! 13.860 down to 10.929 seconds
15:38 Bmagic_ DB caching maybe?
15:42 hopkinsju joined #evergreen
15:42 Bmagic joined #evergreen
15:43 Bmagic jeff: so I guess this is not a bug?
15:44 jeff i'm with berick in wondering how they were doubled up.
15:44 Bmagic that would be nice to know. Perhaps through an upgrade?
15:44 jeff i don't know how worth it it would be to do either/both of 1) preventing the possibility of them being doubled up and/or 2) preventing their doubling-up from impacting that process
15:44 Bmagic Like I said, I don't remember the last time we ran autogen -u - probably 2 years
15:46 jeff bringing settings-tester (or something like it) back in vogue and teaching it things like "hey, your org unit proximities are doubled -- you should run autogen.sh -u" :-)
15:48 doug_evanced joined #evergreen
15:49 * jeff senses a SIP question
15:58 jeff ...or possibly not.
15:58 jeff greetings, doug!
16:00 doug_evanced hi, just browsing
16:02 julialima_ left #evergreen
16:11 * dbs thought "-u" was turned into a no-op years ago
16:12 dbs guess not!
16:34 eby joined #evergreen
16:45 mdriscoll1 left #evergreen
17:03 pinesol_green Incoming from qatests: Test Success - http://testing.evergreen-ils.org/~live/test.html
17:07 mmorgan left #evergreen
18:16 mrpeters left #evergreen
20:09 maryj joined #evergreen
20:34 dcook joined #evergreen
21:21 finnx joined #evergreen
22:23 finnx left #evergreen
