Time |
Nick |
Message |
05:00 |
pinesol_green |
News from qatests: Test Failure <http://testing.evergreen-ils.org/~live> |
07:18 |
|
rjackson_isl joined #evergreen |
07:18 |
|
agoben joined #evergreen |
07:21 |
|
Callender joined #evergreen |
07:29 |
|
JBoyer joined #evergreen |
08:40 |
|
finnx joined #evergreen |
08:44 |
|
mmorgan joined #evergreen |
08:45 |
|
mmorgan joined #evergreen |
08:47 |
|
mmorgan joined #evergreen |
08:51 |
|
Dyrcona joined #evergreen |
08:57 |
|
bos20k joined #evergreen |
09:21 |
|
yboston joined #evergreen |
09:22 |
|
collum joined #evergreen |
09:48 |
|
mdriscoll joined #evergreen |
09:48 |
|
kmlussier joined #evergreen |
10:03 |
|
mmorgan1 joined #evergreen |
10:05 |
|
sandbergja joined #evergreen |
10:15 |
kmlussier |
Good morning #evergreen! |
10:17 |
Bmagic |
Good morning! |
10:18 |
Bmagic |
Anyone out there have a 2.11 server handy? I would like to know if the pull list print alternate strategy works for you |
10:19 |
bshum |
We should just deprecate those already :) |
10:19 |
bshum |
I mean remove |
10:19 |
bshum |
If they aren't already deprecated? |
10:20 |
Bmagic |
I guess not |
10:21 |
phasefx |
never again (re: plethora of almost-redundant interfaces) :) |
10:21 |
Dyrcona |
Well, deprecated and removed are different things. |
10:23 |
Dyrcona |
My opinion: Just leave it in the XUL client but don't add it to the browser staff client. |
10:23 |
* kmlussier |
was just about to say the same thing as Dyrcona |
10:23 |
kmlussier |
But he beat me to it. |
10:24 |
Dyrcona |
I win for a change. :) |
10:24 |
kmlussier |
Right now, there is just one pull list in the web client. But, there are still some features from the simplified pull list that haven't made it to the web client that I hope get implemented. |
10:25 |
kmlussier |
Dyrcona: That's because I was just heading back to my desk when the conversation started. :) |
10:25 |
Dyrcona |
Heh. |
10:28 |
|
jvwoolf joined #evergreen |
10:28 |
|
mmorgan joined #evergreen |
10:29 |
kmlussier |
Also, I don't think sorting is available in the web client pull list yet. |
10:29 |
berick |
https://bugs.launchpad.net/evergreen/+bug/1437104 |
10:29 |
pinesol_green |
Launchpad bug 1437104 in Evergreen "some columns are not sortable in the web staff client" [Undecided,Confirmed] - Assigned to Stephen (smoss-e) |
10:30 |
berick |
still need to finish / merge the holds pull list code
10:30 |
kmlussier |
berick: Yeah, I was just looking at your comment here https://bugs.launchpad.net/evergreen/+bug/1437104/comments/1 |
10:30 |
berick |
hmm, should remove the assignee.. been over a year |
10:34 |
Bmagic |
open-ils.circ.hold_pull_list.print.stream: Use of uninitialized value in int at /usr/local/share/perl/5.22.1/OpenILS/Application/Circ/Holds.pm line 1688 |
10:36 |
Dyrcona |
In master, that is removing the org unit id from the params. |
10:37 |
Dyrcona |
In 2.11, that might be something different. |
10:38 |
Dyrcona |
It's also very likely innocuous: delete($$params{org_id}) unless (int($$params{org_id})); |
10:38 |
Dyrcona |
I guess it is really an attempt to make sure that org_id is an integer, and it is apparently not being passed. |
10:39 |
Dyrcona |
If it's not that line in 2.11, then it is something of a very similar pattern. |
10:41 |
Dyrcona |
Adding a defined($$params{org_id}) && inside the unless, before the int() call would make the message go away. ;) |
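A minimal, runnable sketch of the guard Dyrcona describes, assuming 2.11 uses the same pattern as the line quoted above; the exact location in Holds.pm may differ between versions:

    use strict;
    use warnings;

    # Simulate the failing case: a params hash with no org_id key.
    my $params = { sort => ['acplo.position'] };

    # Original pattern (from Holds.pm): int() on an undefined org_id
    # emits "Use of uninitialized value in int".
    # delete($$params{org_id}) unless (int($$params{org_id}));

    # Guarded pattern: defined() short-circuits before int() sees undef,
    # and a missing or non-numeric org_id is still removed from the params.
    delete($$params{org_id})
        unless (defined($$params{org_id}) && int($$params{org_id}));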
10:44 |
Dyrcona |
If you're not getting the org_id, then check that they've actually registered a workstation. |
10:50 |
Bmagic |
Dyrcona: I'm seeing those same lines. Those are the lines that are in production right now |
10:51 |
Dyrcona |
Bmagic: OK. So the org_id is not getting passed. That would be in the staff client somewhere. |
10:52 |
Dyrcona |
Maybe the user's work_ou is not set properly? |
10:52 |
Bmagic |
hmmm |
10:52 |
* Dyrcona |
just speculates. I don't know that part of the staff client. |
10:53 |
Bmagic |
It's happening everywhere, all branches. I'm digging. It's working on my test server....
10:55 |
Dyrcona |
Hmm. Maybe changes for the browser staff client or something happened during the upgrade? |
10:56 |
Bmagic |
alt_holds_print.js are identical |
10:57 |
Dyrcona |
Well, that's the end of my speculations. |
10:57 |
Bmagic |
Holds.pm are identical |
10:58 |
Bmagic |
it's gotta be somewhere else |
10:58 |
Bmagic |
db function?? |
10:58 |
Dyrcona |
I doubt it's a db function, because the error is before you get to the db. |
11:06 |
Bmagic |
how about this |
11:06 |
Bmagic |
File does not exist: /openils/var/web/js/dojo/fieldmapper/OrgLasso.js |
11:07 |
Dyrcona |
That might be it. |
11:08 |
Bmagic |
hmm, well that file is missing on the test server too |
11:19 |
kmlussier |
Bmagic: I don't know if this is what your users are seeing, but I just tried to use the alternate strategy on a VM with master. I just get a progress bar that hangs, and no titles load. |
11:19 |
Bmagic |
that is what we are seeing |
11:19 |
Bmagic |
however, my test machine with the exact same data has no issues..... |
11:20 |
kmlussier |
There was one test in which 10 of the 41 titles did load successfully, but the progress bar hung after that. |
11:20 |
Bmagic |
Obviously, my test machine has a different set of code on it; I'm digging to find the differences
11:21 |
kmlussier |
If you end up filing a bug, I can confirm it. I'm not sure where you saw that error, so I don't know if we get the same error. |
11:22 |
|
Christineb joined #evergreen |
11:23 |
Bmagic |
kmlussier: thanks |
11:28 |
Bmagic |
ok, my test server used rel_2_11 and the production machine used tags/rel_2_11_0 |
11:41 |
csharp |
Bmagic: my 2.11.0 test server just pops up with an alert that says "No Results" |
11:42 |
Bmagic |
csharp: yeah, you need to have at least one item on the pull list |
11:43 |
csharp |
trying a branch with items |
11:45 |
csharp |
Bmagic: yep - same behavior - same errors in the osrfwarn.log |
11:45 |
Bmagic |
ah, ok then |
11:45 |
Bmagic |
are you on master? |
11:46 |
Bmagic |
or tags/rel_2_11_0 ? or just plain rel_2_11 ? |
11:48 |
csharp |
2.11.0 |
11:48 |
csharp |
(the tarball release) |
11:55 |
Bmagic |
I see |
11:56 |
Bmagic |
Somehow it's not broken on my test machine. But I will report the bug on launchpad after this conference call |
12:04 |
csharp |
Bmagic: here's the call I see: open-ils.circ open-ils.circ.hold_pull_list.print.stream "<authtoken>", {"org_id":null,"limit":null,"offset":null,"chunk_size":null,"sort":["acplo.position","prefix","call_number","suffix","request_time"]} |
12:04 |
|
jihpringle joined #evergreen |
12:04 |
csharp |
"org_id":null seems to be the relevant part |
12:04 |
Bmagic |
yep, that's got to be it |
12:05 |
csharp |
so the JS layer is missing the org id for some reason |
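For anyone who wants to confirm that the missing org_id is the only difference, the same method can be called from a small OpenSRF Perl client with an explicit org unit. A rough sketch, assuming the stock /openils/conf/opensrf_core.xml path; the authtoken and org_id values are placeholders:

    #!/usr/bin/perl
    use strict;
    use warnings;
    use OpenSRF::System;
    use OpenSRF::AppSession;
    use OpenSRF::Utils::JSON;

    # Bootstrap the OpenSRF client (stock config path assumed).
    OpenSRF::System->bootstrap_client(config_file => '/openils/conf/opensrf_core.xml');

    my $authtoken = shift @ARGV or die "usage: $0 <authtoken> [org_id]\n";
    my $org_id    = shift @ARGV;   # e.g. the workstation's org unit ID

    my $ses = OpenSRF::AppSession->create('open-ils.circ');
    my $req = $ses->request(
        'open-ils.circ.hold_pull_list.print.stream',
        $authtoken,
        {
            org_id => $org_id,
            sort   => ['acplo.position', 'prefix', 'call_number', 'suffix', 'request_time'],
        }
    );

    # The method streams its results; print each chunk as it arrives.
    while (my $resp = $req->recv(timeout => 60)) {
        print OpenSRF::Utils::JSON->perl2JSON($resp->content), "\n";
    }

    $ses->kill_me;

If the call completes with an explicit org_id but hangs with org_id null, that points back at the client-side code that builds the parameters.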
12:06 |
Bmagic |
I wonder if this is related to the thing that Dyrcona and miker and I were troubleshooting at the hack-a-way - with the introduction of badges |
12:07 |
Bmagic |
bug 1639236 |
12:07 |
miker |
Bmagic: that was specifically about the internal structure of search results as they pass through the biz logic ... so, I don't think it's likely to be related |
12:07 |
pinesol_green |
Launchpad bug 1639236 in Evergreen 2.11 "Temporary List Display Broken in 2.11/Master" [Medium,Confirmed] https://launchpad.net/bugs/1639236 |
12:08 |
Bmagic |
org_unit is null.... sort of the issue for My Lists as well
12:12 |
|
brahmina joined #evergreen |
12:19 |
dbwells |
'org_id:null' is true for us on 2.10.x, but not causing problems. (We get the warnings, but the page loads fine.) That doesn't seem likely to be the problem. |
12:21 |
kmlussier |
dbwells: The alternate print page is loading fine for you? Huh. |
12:22 |
kmlussier |
I wonder why it's not working for others, then. |
12:22 |
jeff |
oh. we had a problem with an interface or two after our 2.10 upgrade. i should remember what that issue was. |
12:23 |
jeff |
(could be entirely unrelated to issue-at-hand. i haven't been following closely.) |
12:23 |
* jeff |
stops thinking out loud for a moment |
12:26 |
jeff |
nope, records (seem to) indicate that i was thinking about something else. |
12:30 |
Bmagic |
lol |
12:47 |
|
hbrennan joined #evergreen |
12:56 |
csharp |
Bmagic: hmm - strangely, the holds pull list is working for me know |
12:56 |
csharp |
s/know/now/ |
12:56 |
Bmagic |
ha! |
12:56 |
Bmagic |
I've gotten some reports that it's working for some people as well (sometimes)
12:56 |
csharp |
that makes me wonder whether it was a cold DB cache issue |
12:56 |
csharp |
maybe the request timed out? |
12:56 |
JBoyer |
'Twas a sunspot, nothing more, rapping, rapping at my hard drive's door. |
12:56 |
* csharp |
goes to his logs |
13:00 |
Dyrcona |
JBoyer++ |
13:02 |
csharp |
I can see in the logs that it pulled up all the data, but since the "streaming" part died, it apparently had nowhere to go |
13:03 |
* kmlussier |
tries the holds pull list again, still gets stuck progress bar. |
13:04 |
csharp |
and the same WARN log messages appear with the "working" list |
13:08 |
jeff |
hrm. there's really no "time marked lost/longoverdue", short of assuming that's the time the copy was edited (when the status changed to Lost / Longoverdue). |
13:08 |
jeff |
well, the billing timestamp for the lost charge, perhaps. |
13:09 |
jeff |
the circ's stop_fines gets overwritten to LOST (from null or MAXFINES, etc) -- but the stop_fines_time is not overwritten (if present -- i'd have to look to confirm that it's set if null) |
13:11 |
csharp |
jeff: we use the edit_date and haven't seen problems with that |
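A rough sketch of combining those two approximations in SQL via DBI: prefer the earliest unvoided Lost Materials billing on the circulation, and fall back to the copy's edit_date. The connection details are placeholders, and the billing type name assumes a stock Evergreen configuration:

    #!/usr/bin/perl
    use strict;
    use warnings;
    use DBI;

    # Placeholder connection details; adjust for your database.
    my $dbh = DBI->connect(
        'dbi:Pg:dbname=evergreen;host=localhost', 'evergreen', 'password',
        {RaiseError => 1}
    );

    # Approximate "time marked lost" per the discussion above.
    my $sql = q{
        SELECT circ.id,
               COALESCE(MIN(mb.billing_ts), acp.edit_date) AS marked_lost_time
          FROM action.circulation circ
          JOIN asset.copy acp ON (acp.id = circ.target_copy)
          LEFT JOIN money.billing mb
                 ON (mb.xact = circ.id AND mb.billing_type = 'Lost Materials' AND NOT mb.voided)
         WHERE circ.stop_fines = 'LOST'
         GROUP BY circ.id, acp.edit_date
    };

    for my $row (@{ $dbh->selectall_arrayref($sql) }) {
        printf "circ %d marked lost around %s\n", @$row;
    }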
13:12 |
kmlussier |
I asked Dyrcona this question a few days ago, but want to throw it out to other former RMs and buildmasters. For the past few releases, we've been scheduling major releases to fall on the same day as the point releases. |
13:12 |
kmlussier |
Has that worked out well for you or do you find it leads to too much churn happening all on the same day? |
13:15 |
kmlussier |
I discovered a very old wiki page, which I have since deleted, where it looks like we had previously targeted major releases for the week after point releases. https://wiki.evergreen-ils.org/doku.php?id=release_schedule&rev=1357756963
13:21 |
bshum |
Personally I like having releases closer together, to avoid situations where a bug fix in one series gets missed in the release for another series. Keeping things more in lock step for that reason, so you can more easily say "the bug is fixed as of this set of releases", made more sense to me
13:22 |
bshum |
Than trying to trace back where/when did a fix get released |
13:22 |
bshum |
But my opinion is low :) |
13:49 |
Bmagic |
Another question I have - when load balancing SIP servers, each SIP server needs to know about one another's "sessions" so that it doesn't matter which SIP server is assigned. Do I need personality='Multiplex'? Is that all I need to change?
13:51 |
berick |
Bmagic: setting the personality to Multiplex is an option, but it's not a requirement |
13:51 |
berick |
it has no effect on load balancing |
13:52 |
Bmagic |
berick: my theory for the SIP issues right now is that one SIP server receives the login and the subsequent messages are sent to another server in the pool causing the login prompt once more |
13:53 |
berick |
Bmagic: SIP communication occurs over a long-lived TCP connection. once connected to a SIP server, the client sends all communication to that server until it's done and disconnects.
13:54 |
Bmagic |
right, but the load balancer that I am using doesn't care - it sends requests wherever
13:54 |
berick |
it's not possible to truly load-balance every sip request, only the initial connections |
13:55 |
berick |
Bmagic: yeah, that won't work |
13:55 |
Bmagic |
ok, so there is nothing I can do to configure SIP to "look" at the other SIP servers via memcached maybe? |
13:57 |
berick |
possible in theory, but not as SIPServer is coded today. it assumes long-lived connections |
13:58 |
Dyrcona |
Clients are worse about it than SIPServer, actually. |
13:58 |
Dyrcona |
Most self checks that I've seen are written to open a single TCP connection and keep that connection open all day. |
13:59 |
Dyrcona |
A router dropping idle connections can cause them problems. |
14:00 |
Dyrcona |
We're load balancing SIP and don't have problems that I would attribute to load balancing. |
14:00 |
jeff |
Bmagic: are you trying to use a load balancer that maintains a long-lived TCP session to the client but drops the connection to the backend, then tries to re-establish somehow without logging in?
14:01 |
jeff |
Bmagic: stepping back, what does your load balancer look like and what issues are you having? |
14:02 |
Bmagic |
jeff: The libraries are having to log in multiple times. I don't have evidence, but my theory is that the load-balanced SIP servers are requesting the login again because they were not the server the client was originally talking to
14:03 |
Bmagic |
jeff: the load balancer in this case is handled by google cloud. I don't have much access to the logic |
14:04 |
dbs |
Bmagic: this sounds somewhat familiar, even though we're only running one SIP server. |
14:04 |
Bmagic |
I believe my solution will be to set up a separate SIP server all by itself and point our libraries at it. I am doing that now with one library and it has cleared up the issue
14:05 |
dbs |
In our case, we ran into trouble with the various timeout settings |
14:05 |
Bmagic |
jeff++ Dyrcona++ miker++ kmlussier++ csharp++ dbs++ # being patient with ME's upgrade to 2.11 |
14:05 |
dbs |
as well as the oils_auth workstation timeout |
14:06 |
jeff |
Bmagic: what sip server personality are you using, and do your clients send routine ACS status messages (that serve as a keep-alive)? |
14:07 |
Bmagic |
dbs: our SIP config didn't change (other than using the latest SIPServer git branch). The server and load balancer did change though.
14:07 |
dbs |
some of my mind-babbling through our problems around http://irc.evergreen-ils.org/evergreen/2016-10-12#i_272036 |
14:07 |
Bmagic |
PreFork |
14:07 |
Bmagic |
not sure about the keep-alive. Would that show up in logs as a "99" request? |
14:08 |
dbs |
PreFork here, too. http://irc.evergreen-ils.org/evergreen/2016-10-12#i_272036 is what resolved most of the problems for us (the authtoken expiring) |
14:08 |
dbwells |
kmlussier: I think we should consider having milestone releases come shortly after the point releases, with the main issue being the upgrade scripts. It is desirable to have the major version upgrade target the latest point release, e.g. 2.10.6-2.11.rc-upgrade-db.sql. If that point release is the same day, the person building from master either has to wait or ends up producing a version which is immediately out of date. Just a factor to think about.
14:08 |
dbs |
err http://irc.evergreen-ils.org/evergreen/2016-10-12#i_272068 |
14:08 |
berick |
phasefx: was looking at the local storage in my browser and noticed the worklog stores the entire fleshed blob of transaction data in addition to the concise user/copy/hold/action data. was there a reason to store all the transaction data? would like to avoid cramming so much into storage if possible. |
14:09 |
phasefx |
berick: don't think so.. I'd be okay with it just storing what ultimately gets displayed |
14:09 |
Bmagic |
dbs: one clue here is when I reduce our SIP server pool to a single machine, the issue goes away |
14:10 |
kmlussier |
dbwells: Thanks for that feedback. Also, thanks bshum for his feedback, which I saw, but then didn't acknowledge. :) |
14:10 |
berick |
phasefx: ok, thanks. i'll make a note to revisit that |
14:10 |
jeff |
Bmagic: are you using what Google calls "Network Load Balancing", with target pools and forwarding rules? |
14:10 |
Bmagic |
jeff: yes |
14:11 |
phasefx |
berick++ |
14:14 |
phasefx |
berick: it looks like just removing one line might do the trick, near "var entry" in ui.js |
14:15 |
jeff |
Bmagic: nothing there jumps out as being immediately problematic. have you confirmed that you don't just have a bad backend server, where every connection that hits that backend fails? |
14:15 |
phasefx |
berick: get rid of 'data' : data |
14:15 |
berick |
phasefx: cool, that's just where I was poking around. thanks for confirming. |
14:15 |
berick |
i'll give it a try |
14:16 |
phasefx |
rock |
14:16 |
Bmagic |
jeff: It could be that. I can't rule it out. I might have to pick this back up when things are less "fresh" and the libraries are calling/emailing |
14:16 |
* jeff |
nods |
14:21 |
berick |
phasefx: that did it. logging actions still work as expected |
14:21 |
Bmagic |
y'all - since it seems like there is a good audience here: Missouri Evergreen is live on the Google cloud using Docker containers, running 2.11.0. The app bricks have autoscaled up and down all day. The oldest app server is 2 hours old and the newest one is 33 minutes old
14:21 |
phasefx |
berick++ |
14:22 |
berick |
Bmagic++ |
14:22 |
berick |
cool |
14:22 |
Bmagic |
No "real" issues with the hosting. Speed is nice. System is stable |
14:22 |
hbrennan |
bmagic++ |
14:22 |
phasefx |
Bmagic: sweet |
14:23 |
Bmagic |
it's addicting to watch |
14:24 |
Bmagic |
cue the DDoS attack
14:26 |
Bmagic |
it's interesting to see that we had 8 app servers before. Now that we let the system decide how many we need, it only needs 2. I haven't seen it get over 4 today. LOL
14:27 |
kmlussier |
Bmagic++ Nice! |
14:28 |
Dyrcona |
Bmagic: Would you care to share your experience in a presentation at the conference? |
14:28 |
Bmagic |
Dyrcona: that is a good idea |
14:28 |
Bmagic |
and I was struggling to come up with something to present. LOL |
14:32 |
Dyrcona |
Bmagic++ |
14:37 |
kmlussier |
Bmagic / csharp: Update on our previous discussion. I just tried the alternate print pull list on a 2.10.7 server and saw the same issue there. |
14:44 |
dbs |
Bmagic: wow, I really really hope you document what you're doing! |
14:44 |
dbs |
also... what OS are your docker container guests? |
15:02 |
Bmagic |
dbs: 16.04 |
15:08 |
Bmagic |
dbs: here is what I have published so far |
15:08 |
Bmagic |
https://github.com/mcoia/eg-docker |
15:09 |
Bmagic |
todo: actually document the process |
15:10 |
kmlussier |
90% of my todos are "document the process." |
15:43 |
dbs |
Bmagic++ |
15:57 |
|
RBecker joined #evergreen |
16:57 |
|
jvwoolf left #evergreen |
17:00 |
pinesol_green |
News from qatests: Test Failure <http://testing.evergreen-ils.org/~live> |
17:04 |
|
mmorgan left #evergreen |
17:05 |
phasefx |
incidentally, I tried to build a clean wheezy install I could run an actual report on (to test the last fix for the failure above), and ran into different issues with libdbi (where the debian-jessie way of doing it should work). Will still revisit |
20:33 |
|
makohund joined #evergreen |
20:56 |
|
hbrennan joined #evergreen |
21:00 |
|
makohund left #evergreen |