Evergreen ILS Website

IRC log for #evergreen, 2017-01-17


All times shown according to the server's local time.

Time Nick Message
00:43 bmills joined #evergreen
05:01 pinesol_green News from qatests: Test Success <http://testing.evergreen-ils.org/~live>
05:21 gmcharlt joined #evergreen
06:40 rlefaive joined #evergreen
07:14 rjackson_isl joined #evergreen
07:15 JBoyer joined #evergreen
08:19 rlefaive joined #evergreen
08:28 collum joined #evergreen
08:41 mmorgan joined #evergreen
08:58 jvwoolf joined #evergreen
09:01 yboston joined #evergreen
09:02 kmlussier joined #evergreen
09:21 kmlussier joined #evergreen
09:26 Dyrcona joined #evergreen
09:28 jonadab joined #evergreen
09:47 agoben joined #evergreen
09:56 agoben joined #evergreen
10:06 csharp we're having significant issues after our upgrade to 2.11.1 - some of which is attributable to very long running queries from the action.summarize_all_circ_chain() function
10:06 csharp we need an index somewhere I think, but I don't know where
10:06 berick csharp: iirc, it's pulling data from aged_circulation now
10:07 csharp right - and action.circulation and action.aged_circulation are huge and are getting seq scans
10:10 berick got the original query and/or analyze?
10:10 csharp I'm running an explain analyze now on our test server - coming soon (or not)
10:10 csharp argh - it's not giving me any useful data
10:11 csharp http://pastebin.com/SHumZdwj
10:11 JBoyer You'll likely have to run some of the queries inside it by hand to see what's really happening. postgres really doesn't explain functions. :/
10:12 berick yeah...
10:14 csharp it's action.all_circ_chain, and it contains 2 loops
10:15 csharp select * from action.all_circulation where id = blah; comes back quickly
10:15 csharp I notice that the function is "volatile" - doesn't that mean that it has the power to change data?
10:16 berick what about WHERE parent_circ = blah ?
10:16 JBoyer csharp, volatile means that you can call the function with the same inputs and get different outputs. it helps the optimizer make decisions.
10:16 csharp that's probably it
10:16 csharp it's hanging forever now
10:17 berick yeah, no index on that for aged_circ
10:17 csharp JBoyer: ok - thanks
10:17 csharp so add an index for parent_circ?
10:17 berick yeah
10:17 csharp ok - thanks
10:19 JBoyer csharp, I don't suppose you've heard about issues with bib merging? We've had some take up to 10 minutes but haven't been able to narrow down the cause yet and am wondering if it's just a local issue.
10:20 csharp ERROR:  "all_circulation" is not a table or materialized view
10:21 csharp JBoyer: I have definitely heard about bib merging issues
10:21 csharp not just you
10:21 berick csharp: on aged_circulation
10:21 csharp argh - thanks
10:21 * mmorgan has encountered bib merging issues as well, just this morning. Seems to be related to moving the holds.
10:22 csharp berick: in other news, we're live with the new hold targeter - a few of our libraries have noticed far smaller than normal pull lists - we're going to wait and see what's going on there
10:22 * JBoyer wonders (though hasn't checked) if the new reporter.hold_request_record interacts oddly?
10:23 berick csharp: ok, thanks, keep me posted
10:25 mmorgan 30 seconds to transfer one title level hold.
10:30 krvmga joined #evergreen
10:31 csharp also seeing lots of issues with auth - but I want to fix the circ issue before moving to that
10:31 csharp like some staff and patrons are not able to log in and we're getting a steady flow of No authentication seed found. open-ils.auth.authenticate.init must be called first  (check that memcached is running and can be connected to)
10:32 JBoyer mmorgan, that would certainly explain why some are nigh instant and others are forever long.
10:32 csharp we upped the number of auth and auth_internal children to 25 each
10:32 JBoyer especially since I messed with asset.merge_record_assets locally to ignore deleted items and calls and fulfilled or canceled holds.
10:33 mmorgan csharp: I feel your pain!!! exactly the same auth thing here!
10:35 csharp mmorgan: really?
10:35 csharp mmorgan: did you upgrade this weekend too?
10:35 berick csharp: were you seeing 'no children' messages for any of the auth services in the warning logs?  or was that more just a sanity boost?
10:35 csharp I was seeing no children available
10:35 JBoyer csharp, mmorgan, is every machine that needs to auth users running the auth_internal service and are they all pointed at the same memcache servers? That's not something I've seen. (I did have a lot of issues when I forgot to turn on auth_internal at the bottom of opensrf.xml though)
10:36 mmorgan csharp: We upgraded last weekend. didn't see the auth issue until this morning.
10:36 Dyrcona Best thing is to run auth_internal on all of the servers that run drones.
10:36 csharp JBoyer: yes, all machines needing it are running it and they are pointed to the same 2 non-redundant servers
10:37 Dyrcona Probably don't need it on brick heads if you're split up that way.
10:37 krvmga are there issues with adding google mobile phone service to the sms carrier list?
10:39 * csharp wonders if the 2 separate memcache servers is an issue with the new auth services
10:39 JBoyer I figured. are the two servers specified in the same order in every instance? (I'm wondering about hashing differences in where memcache chooses to store / look for keys)
10:39 csharp JBoyer: same order, yes
10:39 Dyrcona We have a similar setup on 2.10 and have not had issues.
10:39 berick caching should not be any different w/ the new auth code.  (it all uses the same C cache client)
10:39 JBoyer Shoot. Well, you said you wanted to get through circ anyway, I'll stop grasping at straws.
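
The server-order concern raised above can be illustrated with a minimal sketch. It assumes a simple "hash the key, take it modulo the server count" scheme; real memcached clients may use different hashing, and the server names and key names here are hypothetical. It only shows why two clients that list the same servers in a different order can disagree about where a key lives.

```python
import zlib

def pick_server(key, servers):
    """Choose a server by hashing the key modulo the list length (simplified scheme)."""
    return servers[zlib.crc32(key.encode("utf-8")) % len(servers)]

# Hypothetical server names; both clients know the same two servers.
servers_a = ["cache1:11211", "cache2:11211"]
servers_b = list(reversed(servers_a))

# Hypothetical cache keys standing in for login seeds.
keys = [f"auth_seed_{n}" for n in range(8)]

# Keys that the two clients would look for on different servers.
mismatched = [k for k in keys if pick_server(k, servers_a) != pick_server(k, servers_b)]
# Keys that two clients with identical ordering agree on.
matched = [k for k in keys if pick_server(k, servers_a) == pick_server(k, list(servers_a))]

print(len(mismatched), "of", len(keys), "keys land on different servers when the order differs")
```

With identical ordering, as csharp reports, this particular failure mode does not apply.
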
10:40 berick running out of children is, of course, a problem
10:41 bshum krvmga: If there's an email text gateway for the SMS config, i don't see why not.
10:42 bshum But I haven't tested it personally in any way.
10:42 bshum And who knows with google voice, vs. I assume you mean something like Project Fi users
10:42 berick csharp: were you also seeing no-children for both open-ils.auth and auth_internal?  it may just be that the login process takes longer now, just enough to throw off the delicate balance of max_children
10:42 csharp berick: just auth, not auth_internal
10:42 berick ok
10:43 berick that, then I bet that's what it is.  it's just taking longer
10:43 berick which is somewhat expected
10:43 berick w/ the newfangled hashing
10:44 berick fortunately, auth and auth_internal require minimal resources
10:46 krvmga bshum: thanks. i'm still trying to figure that out :)
10:46 bshum krvmga: yeah I just read over this article about Project Fi integration for SMS/email  -  https://support.google.com/fi/answer/6356597?hl=en
10:47 bshum But I kind of hate it, given the precarious nature of hangouts vs. sms vs. google voice vs. whatever else google is doing under the hood with all that.
10:47 krvmga yes, thanks. i was looking at the same.
10:47 bshum They keep changing their minds over there.
10:47 Christineb joined #evergreen
10:48 bshum Live testing, whee
10:54 csharp berick++ # adding the parent_circ index to aged_circulation works - now waiting for index creation to finish on our overloaded live DB :-)
10:54 berick awesome!
10:54 berick csharp: if you open a bug (when you have time) i'll pitch in
10:54 csharp I'll create a bug report for it after we're further out of the woods
10:54 csharp thanks
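
The effect of the parent_circ index discussed above can be sketched outside PostgreSQL. The following uses Python's built-in sqlite3 module purely as an illustration: the two-column table layout and the index name aged_circ_parent_circ_idx are assumptions, and SQLite's planner is not PostgreSQL's. It only shows a lookup on parent_circ changing from a full scan to an index search once the index exists.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified stand-in for action.aged_circulation (assumed layout).
conn.execute("CREATE TABLE aged_circulation (id INTEGER PRIMARY KEY, parent_circ INTEGER)")
conn.executemany(
    "INSERT INTO aged_circulation (id, parent_circ) VALUES (?, ?)",
    [(i, i - 1 if i > 1 else None) for i in range(1, 1001)],
)

def plan(sql, params=()):
    """Return the planner's description of how it would run the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[3] for row in rows)

query = "SELECT id FROM aged_circulation WHERE parent_circ = ?"

before = plan(query, (500,))
# Hypothetical index name; the real fix targets action.aged_circulation (parent_circ).
conn.execute("CREATE INDEX aged_circ_parent_circ_idx ON aged_circulation (parent_circ)")
after = plan(query, (500,))

print("before:", before)
print("after: ", after)
```

On a busy PostgreSQL database the corresponding step would be a CREATE INDEX on action.aged_circulation (parent_circ), optionally with CONCURRENTLY to avoid blocking writes while it builds.
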
11:05 brahmina joined #evergreen
11:05 JBoyer berick, do you think with the aged_circ stuff being used in more places it would make sense to just duplicate the indexes used on circ (where applicable?)
11:06 berick JBoyer: without looking too closely, I'd guess action.circulation has indexes that would never get used on aged_circ
11:07 berick but there may be more we need to copy over
11:09 berick so 12 vs 7 (8 with chris's change) indexes
11:09 berick closer than I thought
11:10 berick and i guess we don't need only_one_concurrent_checkout_per_copy or circ_open_date_idx or circ_open_xacts_idx or circ_outstanding_idx
11:10 JBoyer I was just comparing the list and was called away. :) I figured there may be some that aren't even possible (against patron info maybe?) so not a complete copy.
11:11 berick which means with chris's change, we should be good.  i think it was the only vital one missing
11:11 berick oh, right, and circ_all_usr_idx
11:11 JBoyer Ok, sounds good
11:14 csharp so should we add a usr index too?
11:15 JBoyer Can't, that's not in aged_circulation.
11:15 berick csharp: no usr on aged_circ
11:20 * dbs amuses self with bug description for 1657171
11:20 bshum bug 1657171
11:20 pinesol_green Launchpad bug 1657171 in Evergreen "ASCII apostrophe and Unicode right single quotation mark should be normalized" [Undecided,New] https://launchpad.net/bugs/1657171
11:21 csharp ah - duh, of course :-/
11:22 dbs it's not _that_ amusing of course
11:23 * dbs sees CFP has been extended to 2017-01-20, ponders COV flight...
11:25 berick dbs: i'd drive by and pick you up, but that would extend my trip by about a month
11:27 Dyrcona The airport code is CVG, btw.
11:27 dbs Oh, CVG / COV, right.
11:27 * Dyrcona once had a flight from CVG to CDG. :)
11:27 dbs I was just in CVG in November, for a conference in Cincinnati. Lovely!
11:29 jeff @weather kcvg
11:29 pinesol_green jeff: Cincinnati-Northern KY International, KY :: Mostly Cloudy :: 63F/17C | Tuesday: Showers early, then cloudy in the afternoon. Thunder possible. Morning high of 60F with temps falling to near 50. Winds WSW at 10 to 20 mph. Chance of rain 40%. Tuesday Night: Cloudy. Low 39F. Winds W at 10 to 15 mph. | Updated: 37m ago
11:30 Dyrcona That's warm for this time of year.
11:30 Dyrcona @weather
11:30 pinesol_green Dyrcona: Methuen, MA :: Clear :: 42F/6C | Wind Chill: 40F/5C | Tuesday: Partly cloudy this morning. Increasing clouds with periods of showers this afternoon. High 41F. Winds NE at 5 to 10 mph. Chance of rain 60%. Tuesday Night: Cloudy with periods of rain. Some sleet or freezing rain possible. Low 34F. Winds ENE at 5 to 10 mph. Chance of rain 100%. Rainfall near a half an inch.
11:54 sandbergja joined #evergreen
11:56 Bmagic JBoyer mmorgan - yes we had slowness when merging bibs. It turned out to be reporter.hold_request_record. This is a new table introduced in Evergreen 2.11
11:57 Bmagic In previous versions of Evergreen, that information was stored in a "view" instead of a table.
11:57 Bmagic This issue was resolved by reverting the table back to a view.
11:58 Bmagic csharp ^^ this might be part of your issue as well
11:58 sallyf joined #evergreen
12:00 mmorgan Bmagic: Interesting. Thanks for the tip.
12:03 csharp wait, so we drop the table and re-create the view?
12:04 csharp ok we're back up, and the circ table hang is fixed, but we're still getting "Gateway received error: No authentication seed found. open-ils.auth.authenticate.init must be called first  (check that memcached is running and can be connected to)" at intervals of once every 2-3 minutes
12:05 csharp that error used to pretty much *only* indicate a down memcached server or the like...
12:06 csharp could it be that the error message is kind of a red herring and a down memcached server is not really the issue? (because both servers are up and most people aren't having login trouble at the moment)
12:06 berick csharp: still having no-children warning logs?
12:07 csharp berick: nope
12:08 Bmagic csharp: yes, we dropped the table and created the view, I didn't report it but it looks like I should have. I think I brought it up in IRC in late November
12:09 mmorgan fwiw, we have not had the authentication seed error here for about an hour. We haven't seen no-children warnings.
12:10 csharp Bmagic: thanks - I'll look into that
12:14 berick csharp: mmorgan: we get several 'no auth seed' messages an hour on 2.7. got almost 100 in the last hour. no reports of login problems.
12:15 berick fwiw
12:16 berick if the numbers drastically increased after the 2.10+ upgrade, that's probably not good.  if they've always been there, it may not be a problem.
12:17 berick why they're there, I can't say. have not researched further when it happens here yet
12:17 bmills joined #evergreen
12:19 berick one  cause would be clients taking too long between auth steps (30 second seed timeout, IIRC) or multiple logins for the same username/barcode at roughly the same time w/o a nonce value, which is only implemented in SIP and some perl scripts
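
The second scenario above (two logins for the same username at about the same time without a nonce) can be sketched with a toy model. This is not Evergreen's code: the SeedCache class, its method names, and the "username + nonce" key layout are assumptions made for illustration, and it assumes a seed is consumed by the first login that completes. The 30-second seed timeout is left out.

```python
import secrets

class SeedCache:
    """Toy stand-in for the shared cache that holds login seeds (hypothetical)."""

    def __init__(self):
        self._seeds = {}

    def init(self, username, nonce=""):
        # Step 1 of login: store a fresh seed under a key built from username + nonce.
        seed = secrets.token_hex(8)
        self._seeds[username + nonce] = seed
        return seed

    def complete(self, username, nonce=""):
        # Step 2 of login: the seed is consumed; None models "no seed found".
        return self._seeds.pop(username + nonce, None)

def two_logins(nonces):
    """Run init for every login first, then complete each one; report which succeed."""
    cache = SeedCache()
    for nonce in nonces:
        cache.init("admin", nonce)
    return [cache.complete("admin", nonce) is not None for nonce in nonces]

without_nonce = two_logins(["", ""])      # both logins share one cache key
with_nonce = two_logins(["a1", "b2"])     # each login has its own cache key

print("without nonce:", without_nonce)
print("with nonce:   ", with_nonce)
```

In this model the second overlapping login without a nonce finds no seed, while distinct nonces keep the two logins from colliding.
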
12:22 jihpringle joined #evergreen
12:23 mmorgan berick: The 'No authentication seed found' log entries are not something we've seen before, in recent memory, anyway. When we see them, users are unable to authenticate.
12:24 berick mmorgan: gotcha
12:26 berick then one of the above 2 scenarios or a full cache server are all possibilities
12:26 JBoyer Something to think about for the auth_seed errors: If you've increased the number of processes that use a memcache connection, there is a connection limit and it's not very high by default. csharp mentioned bumping the number of auth processes, so that increase x the number of servers involved might finally push you over the limit.
12:26 JBoyer And I think that type of error was all that I saw, I'm not sure if the memcache lib was returning an error exactly.
12:27 berick JBoyer++ good point
12:27 JBoyer (That, plus the longer time needed to complete an auth that berick mentioned, etc.)
12:28 JBoyer Connection limits come to mind easily for me because I've hit both memcache AND postgres conn limits before. Both are bad, but one is significantly less good than the other. ;)
12:28 * berick likes the kind that die the loudest
12:29 JBoyer so long as it's not "Hark, look ye over there!" followed by *dies of unrelated error*
12:30 csharp or "I died before I could emit a useful error telling you why"
12:33 csharp I upped the memcache connection limit to 4096 per machine - right now were at 1332 and 812
12:33 csharp s/were/we're/
12:33 berick csharp: and plenty of free space within memcache?
12:34 JBoyer Ah, isn't the default something like 1024?
13:02 maryj joined #evergreen
13:09 csharp berick: yep - 8GB per machine, each is hovering around 1.6G
13:10 csharp JBoyer: yep - we upped the limit to 4x default during our post-upgrade craziness last year
13:10 csharp btw, all is very very calm since adding the aged_circ index
13:11 csharp y'all come on by: http://gapines.org
13:11 * JBoyer is about to jump on that index too.
13:11 * csharp had to pg_terminate_backend the lingering hung queries before create index concurrently got to work
13:12 JBoyer And I think the right way to deal with reporter.hold_request_record is probably indexes, not necessarily making it a view again. (though it does seem to have mad churn...)
13:12 csharp (opensrf was stopped everywhere, for the logs)
13:12 csharp I was going to look into alternatives to reverting the feature - still haven't looked it up to see the rationale
13:13 JBoyer Even if it's seq scan'd all day it is such a small table I've had no qualms about doing a vacuum full or reindex on it in the middle of the day; it may take some digging.
13:17 csharp df0d7636
13:17 pinesol_green csharp: [evergreen|Galen Charlton] LP#1549505: update baseline database schema - <http://git.evergreen-ils.org/?p=Evergreen.git;a=commit;h=df0d763>
13:17 csharp bug 1549505
13:17 pinesol_green Launchpad bug 1549505 in Evergreen "Statistically generated record ratings" [Wishlist,Fix released] https://launchpad.net/bugs/1549505
13:24 mmorgan joined #evergreen
13:44 csharp okay - the hang in bib merges definitely happens when moving the T holds from one record to the other
13:45 csharp probably true of any holds
13:50 Dyrcona I seem to recall we've had that conversation before, but maybe I'm imagining things again.
13:55 rlefaive joined #evergreen
13:59 rlefaive_ joined #evergreen
14:12 csharp okay, added indexes to all possible fields on reporter.hold_request_record, but still hanging
14:14 berick csharp: did you see Bmagic's comment around 11:56 ?  sounds like it may be related to reporter.hold_request_record
14:14 berick oh
14:14 berick heh
14:14 berick sorry
14:14 berick i misread what you wrote
14:15 Bmagic csharp: I believe I added indexes before ultimately going back to a view
14:15 csharp gmcharlt: does removing the table and re-adding the view kill the feature in bug 1549505?
14:15 pinesol_green Launchpad bug 1549505 in Evergreen "Statistically generated record ratings" [Wishlist,Fix released] https://launchpad.net/bugs/1549505
14:15 Bmagic I needed to stop the bleeding asap, as we were live
14:16 csharp right, understood
14:17 csharp or miker ^^?
14:19 miker csharp: yes, it kills the feature ...
14:19 csharp miker: any suggestions on optimizing this so we don't hafta do that? :-)
14:19 miker looking
14:19 csharp thanks
14:20 * csharp tries to create a query similar to what the function is doing for explain analyze
14:22 csharp nope - doing a straight UPDATE works very quickly
14:23 berick looking at reporter.hold_request_record_mapper .. am I missing something, where's the WHERE clause in the UPDATE branch?
14:23 gmcharlt berick: yep, that's the issue
14:24 gmcharlt patch is coming momentarily
14:24 berick aha, gmcharlt++
14:24 csharp I noticed that too, but figured it was only getting called for a specific ID
14:24 miker csharp / Bmagic: http://paste.evergreen-ils.org/40
14:24 miker branch forthcoming
14:25 csharp yep, that does it!
14:25 berick miker++
14:25 csharp miker++
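
The problem berick spotted (an UPDATE inside a trigger function with no WHERE clause) can be reproduced in miniature. The sketch below uses Python's sqlite3 module with simplified, assumed table layouts and a hypothetical trigger name; the real reporter.hold_request_record_mapper function in PostgreSQL is more involved. It shows only that an unconstrained UPDATE in a row-level trigger touches every row of the mapping table, while one constrained to the changed hold touches a single row.

```python
import sqlite3

def run(constrained):
    """Update one hold and return the mapping table contents afterwards."""
    conn = sqlite3.connect(":memory:")
    # Simplified stand-ins for the hold table and its mapping table (assumed layout).
    conn.executescript("""
        CREATE TABLE hold_request (id INTEGER PRIMARY KEY, target INTEGER);
        CREATE TABLE hold_request_record (id INTEGER PRIMARY KEY, bib_record INTEGER);
    """)
    where = "WHERE id = NEW.id" if constrained else ""
    # Hypothetical trigger: keeps the mapping in step when a hold's target changes.
    conn.execute(f"""
        CREATE TRIGGER map_hold AFTER UPDATE ON hold_request
        BEGIN
            UPDATE hold_request_record SET bib_record = NEW.target {where};
        END
    """)
    rows = [(1, 10), (2, 20), (3, 30)]
    conn.executemany("INSERT INTO hold_request VALUES (?, ?)", rows)
    conn.executemany("INSERT INTO hold_request_record VALUES (?, ?)", rows)
    # Move a single hold to another record, as a bib merge would.
    conn.execute("UPDATE hold_request SET target = 99 WHERE id = 1")
    return conn.execute(
        "SELECT id, bib_record FROM hold_request_record ORDER BY id"
    ).fetchall()

unconstrained_result = run(constrained=False)
constrained_result = run(constrained=True)

print("no WHERE:  ", unconstrained_result)
print("with WHERE:", constrained_result)
```

On a large mapping table, rewriting every row for each moved hold would be consistent with the slow merges reported earlier in the log.
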
14:27 maryj_ joined #evergreen
14:28 csharp miker: Elaine asked me to pass on her thanks ;-)
14:29 maryj joined #evergreen
14:46 Dyrcona So the load balancer is supposed to balance load, but tell me why 1 brick in particular always seems to get hammered more than the others....
14:47 bshum Dyrcona: What's the method being used?
14:47 Dyrcona I believe it is round robin.
14:57 csharp 52bf3fe8
14:57 pinesol_green csharp: [evergreen|Bill Erickson] LP#1497335 Aged/All circulation API access - <http://git.evergreen-ils.org/?p=Evergreen.git;a=commit;h=52bf3fe>
14:58 csharp bug 1497335
14:58 pinesol_green Launchpad bug 1497335 in Evergreen "Display aged circulations for copies (was: "virtually aged circulations")" [Wishlist,Fix released] https://launchpad.net/bugs/1497335
14:58 gmcharlt berick: kmlussier: whomever - bug 1657237 signed off and ready for review and merging
14:58 pinesol_green Launchpad bug 1657237 in Evergreen "Trigger function maintaining hold-target mapping not well constrained" [Critical,Confirmed] https://launchpad.net/bugs/1657237 - Assigned to Galen Charlton (gmc)
15:05 berick gmcharlt: cool, i'll grab it
15:06 * berick grabs 1004
15:13 csharp berick: while you're at it, I opened bug 1657241 too
15:13 pinesol_green Launchpad bug 1657241 in Evergreen "action.aged_circulation needs a parent_circ index" [High,Confirmed] https://launchpad.net/bugs/1657241
15:16 pinesol_green [evergreen|Mike Rylander] LP#1657237: Properly constrain matview trigger function - <http://git.evergreen-ils.org/?p=Evergreen.git;a=commit;h=bdae986>
15:16 pinesol_green [evergreen|Bill Erickson] LP#1657237 Stamping rhrr mat view trigger repair - <http://git.evergreen-ils.org/?p=Evergreen.git;a=commit;h=8ba7da7>
15:20 berick csharp++
15:22 * berick grabs 1005
15:28 kmlussier miker++ gmcharlt++ berick++
15:30 david___ joined #evergreen
15:30 david___ Hello
15:31 david___ How can I find out what version of XML module for Perl is used by Evergreen version 2.9.6
15:34 mmorgan miker++ gmcharlt++ berick++
15:34 pinesol_green [evergreen|Chris Sharp] LP#1657241 - Add parent_circ index to action.aged_circulation - <http://git.evergreen-ils.org/?p=Evergreen.git;a=commit;h=a688fb8>
15:34 pinesol_green [evergreen|Bill Erickson] LP#1657241 Stamping aged circ parent_circ index - <http://git.evergreen-ils.org/?p=Evergreen.git;a=commit;h=70b406c>
15:36 csharp berick++
15:38 berick david___: there is more than one XML perl module in use.  arguably the most important is XML::LibXML.  you can see what's installed via command line with this:  perl -MXML::LibXML -e 'print "$XML::LibXML::VERSION\n"'
15:43 kmlussier csharp++ berick++
15:48 JBoyer csharp++
15:48 JBoyer berick++
15:49 JBoyer The Indiana catalogers are going to throw a party; they haven't been able to merge records reliably for over 2 weeks.
16:02 berick JBoyer: go ahead and call the paddy wagon.  it promises to be a rager.
16:09 Bmagic miker++ gmcharlt++ berick++ csharp++
16:12 mmorgan Oh yeah. csharp++
16:16 david___ berick: I want to make sure a known vulnerability isn't still exploitable
16:18 david___ berick:  another source just told me it is using version 1.0.3 which is a good thing...  I wasn't sure because Evergreen 2.9.6 is not even the latest version for series 9
16:19 berick david___: a vulnerability in what exactly?
16:20 Dyrcona david___: What version of XML::LibXML you have depends more on your Linux distro than it does Evergreen version.
16:21 berick also that
16:21 Dyrcona It's usually installed from a package: libxml-libxml-perl or similar.
16:23 david___ Thanks for the input...
16:24 david___ left #evergreen
16:50 tsadok joined #evergreen
17:01 pinesol_green News from qatests: Test Success <http://testing.evergreen-ils.org/~live>
17:04 mmorgan left #evergreen
18:08 rlefaive joined #evergreen
19:25 _adb joined #evergreen
19:25 _adb1 joined #evergreen
19:39 Dyrcona Oh, hey! I'm still signed in.
19:40 Dyrcona So, that excessive apache memory use that has been seen on Apache 2.4, I think I'm seeing on Apache 2.2.
19:41 Dyrcona Looking at my bricks, I've got several apache drones using over 700MB of resident memory.
19:42 Dyrcona Eight of them (total) on two of the hosts are using 1GB+ of virtual memory.
19:43 Dyrcona We've had 1 brick out of rotation in the load balancer today and had some interesting behavior.
19:44 Dyrcona Usually erlang uses the most memory on the servers.
19:52 tsadok joined #evergreen
20:29 jihpringle joined #evergreen
20:29 gmcharlt joined #evergreen
20:33 bmills joined #evergreen
20:34 bmills joined #evergreen
20:34 bmills joined #evergreen
20:35 bmills joined #evergreen
20:36 bmills joined #evergreen
20:37 bmills joined #evergreen
20:38 bmills joined #evergreen
20:55 remingtron joined #evergreen
21:37 tsadok joined #evergreen
22:02 tsadok joined #evergreen
22:07 tsadok joined #evergreen
22:25 kmlussier joined #evergreen
22:53 kmlussier @sortinghat
22:53 pinesol_green Hmm... kmlussier... Let me see now... RAVENCLAW!
22:54 jeff @weather --forecast ktvc
22:54 pinesol_green jeff: Forecast: :: Tue: Fog (37F/32F) | Wed: Partly Cloudy (41F/31F) | Thu: Partly Cloudy (44F/32F) | Fri: Chance of Rain (43F/35F)
22:54 jeff @weather --alerts ktvc
22:54 pinesol_green jeff: Alerts: ::  ...Side roads and untreated roads very slick... As temperatures fall overnight, water on snow covered and untreated areas will re-freeze. This will create very slick conditions. Use caution if traveling overnight, as icy spots may be difficult to see. Jk 1030 PM EST Tue Jan 17 2017 ...Side roads and untreated roads very slick... As temperatures fall overnight, water on snow (4 more messages)
22:54 jeff !
23:01 kmlussier Those alerts would keep me off the road!
23:10 jeff the "4 more messages" part is... ominous.
