IRC log for #evergreen, 2019-02-26

All times shown according to the server's local time.

Time	Nick	Message
04:30		tsadok joined #evergreen
04:30		BigRig joined #evergreen
04:30		phasefx joined #evergreen
04:30		maryj joined #evergreen
04:35		dbs_ joined #evergreen
05:01	pinesol	News from qatests: Testing Success <http://testing.evergreen-ils.org/~live>
06:14		Dyrcona joined #evergreen
06:15	Dyrcona	So, our fine generator got stuck twice last night. Nothing in the logs, literally, after 9 seconds from its start time.
06:16	Dyrcona	Well, nothing from the fine generator or the storage service.
06:16	Dyrcona	It also seems that when storage dies, we don't get "not connected to the network" errors when we attempt to connect to it. I'm going to look into that more today.
06:18	Dyrcona	ps -o time,etime on the one that is "running" shows this: 00:00:08 10:17:48
06:19	Dyrcona	So, 8 seconds of CPU after 10 hours, 17 minutes of running.
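The reading of that ps output can be checked with a short sketch. It assumes the usual procps layout for the time and etime columns, [[dd-]hh:]mm:ss; the function name is only illustrative and is not part of Evergreen or OpenSRF.

```python
def ps_time_to_seconds(value: str) -> int:
    """Convert a ps TIME or ELAPSED value ([[dd-]hh:]mm:ss) to seconds."""
    days = 0
    if "-" in value:
        # An optional leading "dd-" carries whole days.
        day_part, value = value.split("-", 1)
        days = int(day_part)
    parts = [int(p) for p in value.split(":")]
    # Pad missing leading fields (hours) with zeros.
    while len(parts) < 3:
        parts.insert(0, 0)
    hours, minutes, seconds = parts
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds


cpu_seconds = ps_time_to_seconds("00:00:08")      # CPU time used
elapsed_seconds = ps_time_to_seconds("10:17:48")  # wall-clock time since start
cpu_share = cpu_seconds / elapsed_seconds

print(f"{cpu_seconds}s CPU over {elapsed_seconds}s elapsed ({cpu_share:.4%})")
```

A process that has used 8 seconds of CPU in more than 37,000 seconds of wall-clock time is almost certainly blocked waiting on something rather than working.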
06:21		Dyrcona joined #evergreen
06:28	Dyrcona	And, my laptop doesn't like my access point.
06:33		stephengwills joined #evergreen
06:37	csharp	jeez
06:38	Dyrcona	Another interesting thing is that while the fine generator is running, I almost never see it doing anything in the database. I sometimes catch the first query running.
06:38	csharp	I don't have specific ideas, but as we added more and more overnight processes to our utility server, they started competing for RAM/processors to the point where we created a second utility server to handle data exports, etc.
06:39	csharp	now we have 3: one for fines/holds/"normal" non-AT stuff, one for data exports/imports, and one solely for AT
06:39	csharp	but your thing doesn't sound like that
06:39	Dyrcona	We have 2 utility servers, but 1 is only used for SQL updates and z39.50.
06:39	csharp	yeah, that's like our "utility02"
06:40	Dyrcona	We have util (z39.50) and util2.
06:40	Dyrcona	And, util2 (which does most of the stuff) is using 1.3GB of swap at the moment. I'm starting to suspect it is a memory issue.
06:41	csharp	if you have the space for a third server, may at least be worth an experiment (or up the RAM if running within a VM with a beefy host)
06:43	csharp	our specs for utility servers are currently 32GB RAM and 16 CPUs (which is overkill for utility02, but since we have the resources, we can just do it)
06:44	Dyrcona	util2 is a physical server with 32 GB of RAM. util is a vm with 16GB.
06:44	csharp	gotcha
06:45	Dyrcona	I don't find any OOM messages in the logs on util2, though.
06:46	csharp	not sure if ours ever showed kernel OOM messages either...
06:46	Dyrcona	I'm going to move our exports to the util vm, or set up a new VM for them.
06:49	Dyrcona	The thing that puzzles me, is the trouble with this started at 8:00 pm, and the big export didn't happen until 10:30 last night.
06:49	Dyrcona	I wonder if it's the hold targeter, which starts about the same time. The hold targeter uses cstore, but if they're both using a lot of RAM from time to time....
06:55	Dyrcona	Not a lot of RAM being used by individual processes on the util2 server at the moment: 1 storage drone is using about 1MB and ejabberd is only using slightly more, which just seems wrong.
06:55	Dyrcona	i.e. too low for ejabberd.
06:57	Dyrcona	csharp: Where do you run reports? Do you have a vm just for Clark?
06:57		agoben joined #evergreen
07:03	Dyrcona	Even with the fine generator and hold targeter both going, systemd and top are using the most CPU on the util2 server.
07:05	Dyrcona	Disk isn't full, either.
07:05	Dyrcona	Only 15% in use.
07:11	Dyrcona	So, I have a zombie storage drone whose parent is another storage drone.
07:13	Dyrcona	And, drones don't handle SIGCHLD properly.
07:13		rjackson_isl joined #evergreen
07:17	Dyrcona	And, my fine generator has been spinning for an hour.
07:19	* Dyrcona	wonders why this would appear to get worse after improving the database configuration?
07:23	Dyrcona	Ok. Running fine_generator on util, I can see it running some queries in the db.
07:24	JBoyer	Seems kind of worst case, but could you bump the logging up to debug on that one machine to see what happens to the storage service?
07:24	JBoyer	Of course, if it seems fine now that may not be a great idea.
07:24	Dyrcona	Well, I moved it.
07:25	Dyrcona	I see queries related to payments and closed days.
07:25	Dyrcona	I should add that when the storage service died last week, I had to restart it twice before it would work.
07:25	JBoyer	Hmm. Haven't run across that one.
07:26	JBoyer	Is there any monitoring that looks at osrf_control --diagnostic?
07:27	JBoyer	That wouldn't stop it falling over but it would be nice and loud shortly after it happens.
07:28	Dyrcona	No, there isn't. I often think we monitor the wrong things, but I've not had time to fix the monitoring because I have to work that out with another department.
07:28		jgoodson_ joined #evergreen
07:28	JBoyer	Oh, yeah. :/
07:29	JBoyer	Speaking of which, I just looked at the ram usage on one of our app servers and it's such a beautiful sawtooth that I have to imagine there's a memory leak somewhere that I'm missing. :(
07:29	Dyrcona	Bugs me that storage was dead and there were 0 "not connected to the network" messages when trying to connect to it. We do monitor the logs for not connected messages.
07:29	JBoyer	(sawtooth between scheduled restarts that is)
07:30	Dyrcona	yeah.. I'm sure there are leaks all over the place...
07:30	Dyrcona	@blame Perl
07:30	pinesol	Dyrcona: Your failure is now complete, Perl.
07:30		tsadok_ joined #evergreen
07:30	Dyrcona	That would be Perl 6, pinesol.
07:30		dbs joined #evergreen
07:31	JBoyer	That sounds similar to something jeff or berick was seeing some time ago, where a drone was somehow being promoted to the listener but not actually doing it. I don't think that situation led to NOT CONN... messages either.
07:31	Dyrcona	Well, that's a nasty bug, and it wouldn't surprise me if it is related to the drone's failure to handle SIGCHLD.
07:34	JBoyer	I thought it was addressed though. (I can't remember anymore if it was a "this will definitely take care of that" or "this should help")
07:34		bdljohn joined #evergreen
07:35	csharp	Dyrcona: yes, we have a machine (actually a bare metal one) that runs clark only
07:37	csharp	Dyrcona: this is a simple osrf_control --diagnostic monitoring script: https://git.evergreen-ils.org/?p=contrib/pines.git;a=blob;f=nagios/check_osrf;h=24f7b8b620dc16613bc23ea373d2e388fdefee9a;hb=HEAD
07:37	csharp	it just lets you know if something has stopped running
07:37	csharp	haven't improved it to care about percentages of drones but that's on the neverending to-do
07:40	Dyrcona	csharp: Thanks. I'll take a look.
08:17		stephengwills_ joined #evergreen
08:21		bos20k joined #evergreen
08:44		mmorgan joined #evergreen
08:53		dbwells joined #evergreen
09:10		remingtron joined #evergreen
09:21		RMiller joined #evergreen
09:23	RMiller	Vandelay question! The MARC batch import is matching and merging with records marked as deleted in the catalog, which I think is causing the item import to fail. How do I fix that?
09:25	RMiller	(It's not producing any actual item import errors; it just sees all the items in the queue and imports 0.)
09:27	Dyrcona	Well, you can undelete the records after they've been merged or fix the code so that Vandelay ignores deleted records (which I thought it did, and this is most definitely a bug).
09:28	RMiller	Can I add some kind of "deleted=false" to the match profile?
09:28	RMiller	I'm importing about 5000 records with their items so I reeeaaally don't want to undelete them one by one
09:29	Dyrcona	No, I don't think that will work.
09:29	Dyrcona	Well, if you have database access you can undelete them in one go with a query.
09:29	RMiller	I'm going to need a heck of a lot of handholding to fix it in the code :/
09:30	Dyrcona	Let me look.
09:30	Dyrcona	RMiller: What release are you on? That might make a difference.
09:30	RMiller	I think we deleted them with one query because our first attempts at getting Vandelay to do its thing led to a bunch of multiples
09:30	RMiller	duplicate records, that is.
09:31	RMiller	3.2
09:33		tlittle joined #evergreen
09:34		yboston joined #evergreen
09:35		sandbergja joined #evergreen
09:40	Dyrcona	RMiller: I'm not sure where the actual code is that finds matches, but this definitely sounds like a bug. I say add something on Launchpad.
09:44	RMiller	Ok, will do. Thanks :)
09:50		Christineb joined #evergreen
10:18	Dyrcona	dbwells++ Bmagic++ sandbergja++
10:18	Dyrcona	Working on releases.
10:19	Bmagic	Dyrcona++ # Rockin the Evergreen casbah
10:24		beanjammin joined #evergreen
10:46		khuckins joined #evergreen
11:13		jvwoolf joined #evergreen
11:15		yboston joined #evergreen
11:25	miker	berick: I have something I'd like to run by you ... I notice after pushing the angular7 server admin page into master (and linking it from the nav bar) it traps you in the eg2 area. I think until more of the ang7 stuff is in place maybe the navbar should link to the ang-js stuff (catalog search, "home" link, etc). or, should we not link to the eg2 server page and just link to parts of it? (the navbar for those parts can still trap you, of course)
11:28	berick	miker: the ang7 navbar should be sending you back to angjs any time you click an angjs-driven page
11:28	berick	the catalog search link is a bug
11:29	miker	well, the "home" link at the top left is sending you to the eg2 splash page. though if the catalog search link under Search is just a bug, that may not be a big issue
11:30	berick	yeah, i could see the home link going back to angjs as well.
11:30	berick	bug 1813646
11:30	pinesol	Launchpad bug 1813646 in Evergreen "search catalog flips between UIs" [Undecided,Confirmed] https://launchpad.net/bugs/1813646
11:32	miker	cool. so, that's a super tiny fix. objection to me just pushing that (and the home link) change?
11:32	berick	no objections
11:36		beanjammin joined #evergreen
11:46	csharp	release_team++ # hope to re-join the effort soon
11:52		aabbee joined #evergreen
12:20		jihpringle joined #evergreen
12:23		sandbergja joined #evergreen
12:23		_sandbergja joined #evergreen
12:46		Dyrcona joined #evergreen
13:17		jvwoolf1 joined #evergreen
13:51		yboston joined #evergreen
14:18		bdljohn1 joined #evergreen
14:20		mmorgan1 joined #evergreen
14:26		yboston joined #evergreen
14:42	Bmagic	Anyone have the magic sms gateway for spectrum mobile?
14:56	Dyrcona	Nope. Not even the non-magical one.
14:56	Bmagic	:)
15:02	Dyrcona	Can't find it online either, though it looks like Spectrum Mobile is Charter Cable is/was Time Warner Cable.
15:03	Dyrcona	I did find this handy list: https://kb.sandisk.com/app/answers/detail/a_id/17056/~/list-of-mobile-carrier-gateway-addresses
15:10	rhamby	Bmagic: seems like a longshot but Spectrum uses the Verizon network so you can try the Verizon one if Spectrum's isn't available or see how good their Twitter support is @Ask_Spectrum :)
15:11	Bmagic	rhamby: thanks! I basically suggested that. "Try all of them until one works"
15:11	Bmagic	or none works...
15:18		sandbergja_ joined #evergreen
15:19	jeff	And this is why I'm happy to pay per text message... :-)
15:27	csharp	yeah, we were having this conversation earlier
15:27	csharp	my pat answer for staff trying to get us to add carriers is a flat no
15:27	jeff	Oh
15:28	jeff	er, "Oh?"
15:28	csharp	unless the carrier has documentation on their site about setting it up
15:28	jeff	Ah. nod
15:28	jeff	Less flat.
15:28	csharp	we've chased our tail more than once trying to add carriers for some of these off-brand cellular companies (and some more well-known ones)
15:29	jeff	More "not without official documentation from the carrier in question" and perhaps even "a patron willing to test" :-)
15:29	jeff	Which in some cases reduces/simplifies to "No."
15:30	csharp	I guess that's the other nuance that de-flattens my "no" - the patron is welcome to request that information of the carrier :-)
15:31	jeff	I'm pretty sure I've made my strong feelings about this known, but in case anyone's interested in trying the "eliminate the email to text gateways" approach, I'd be interested in helping / collaborating. :-)
15:31	jeff	I think that Stompro is on that short list.
15:31	csharp	I'm all for it and would be willing to assist however I can
15:32	Stompro	jeff, yep, we no longer use it, we use the flowroute api.
15:32	* csharp	's ears perk up (much like his new family dog's)
15:33	jeff	yeah, and we've always used Twilio.
15:33	jeff	though we looked at flowroute, plivo, tropo, and possibly some others.
15:33	jeff	csharp: do you have a sense of your text volume?
15:33	csharp	jeff: we can easily get numbers
15:34	csharp	we should definitely look into that
15:36	csharp	looks like we sent about 26K SMS notices yesterday - probably a typical load for a weekday
15:39	rhamby	Bmagic: I was curious so I asked @ask_spectrum and according to their twitter person they do not (take that for what it's worth, which is I suppose a tweet and 15 seconds of an intern's time)
15:42	Bmagic	rhamby: thanks!
15:42	Bmagic	csharp jeff: I've considered twilio. I'm excited to hear that someone has already hacked evergreen to interact with their API! Is there a bug somewhere... ?
15:42	jeff	Bmagic: stompro has a flowroute bug and branch, i think
15:43	jeff	we took a different approach.
15:43	Bmagic	cool
15:43	jeff	issue gets tricky when you get up to volume, though.
15:43	Bmagic	issues with their throttle?
15:44	Bmagic	API requests / minute or something like that?
15:44	jeff	csharp: good to know an approximation. let me think about that a bit...
15:46	jeff	csharp: can you tell how many unique patrons or how many unique numbers?
15:49	jeff	Bmagic: that and other things. you can send about a message per second as long as you're paying for it. higher volume starts to get into issues not only of cost, but also carriers blocking / requiring you to use a shortcode, pay carrier-specific additional rates, etc.
15:50	Bmagic	I see
15:50	jeff	several paragraphs that i'm not going to type with thumbs right now... :-)
15:51	Bmagic	No worries! Just curious. I don't think our consortium will want to pony up the money, but in case they do....
15:55	jeff	already at csharp's stated volume you would be looking at about $195/day.
15:56	jeff	but i suspect you would quickly find many of those are not actually deliverable... which could reduce your costs. :-)
15:57	jeff	and the number that actually do get delivered would probably go up quite a bit, but you wouldn't have hard numbers to compare since your current deliverability percentage is approximately undef.
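The arithmetic behind the $195/day figure can be sketched from the two numbers quoted in the channel. The per-message rate below is only what those two numbers imply, not a price taken from any provider's rate card, and deliverable_fraction is a hypothetical parameter for the case where undeliverable numbers are dropped from the send list.

```python
MESSAGES_PER_DAY = 26_000   # csharp's estimate of a typical weekday
QUOTED_DAILY_COST = 195.00  # jeff's estimate in US dollars

# Per-message rate implied by the two figures above (an inference, not a quote).
implied_rate = QUOTED_DAILY_COST / MESSAGES_PER_DAY


def daily_cost(messages: int, rate: float, deliverable_fraction: float = 1.0) -> float:
    """Estimated daily spend if only the deliverable share of messages is sent."""
    return messages * deliverable_fraction * rate


print(f"implied rate: ${implied_rate:.4f} per message")
print(f"all messages: ${daily_cost(MESSAGES_PER_DAY, implied_rate):.2f} per day")
# Hypothetical: 20% of numbers turn out to be undeliverable and are removed.
print(f"80% deliverable: ${daily_cost(MESSAGES_PER_DAY, implied_rate, 0.8):.2f} per day")
```

The same function gives a rough monthly figure when multiplied by the number of sending days, which is usually the number a consortium budget discussion needs.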
16:08	jeff	csharp: how much grouping of messages do you currently do? if someone has seven things hit the hold shelf at a library in a day, do they always get seven texts, or only if the items are "far enough apart", time-wise, or do they only ever get one text per patron per lib per day?
16:08	jeff	(that last part might not be possible with stock A/T)
16:17	Dyrcona	A/T can group messages, but within the delay interval only, I think.
16:18	Dyrcona	With sms you're also dealing with a size limit, but I suppose larger messages are split up by the gateway.
16:19	jeff	depends on the gateway, but with Twilio they're split across multiple messages (and you pay for each), so we took steps (for cost and for "don't annoy the patron" reasons) to ensure that we send a very small number of messages.
16:22	Dyrcona	Most gateways are also automatically MMS, too, so if the user doesn't have data, "No TXT for you!" ;)
16:24	csharp	ansible++
16:25	csharp	jeff: I don't remember right this moment, but I'm pretty sure we group them so it's not an insane number of texts per patron
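The grouping and size-limit points above can be illustrated with a minimal sketch. It is not how Evergreen's action/trigger code or any site's Twilio integration works; the tuple layout and function names are hypothetical. The segment count assumes text made up only of GSM 7-bit basic characters, where a single message holds 160 characters and each part of a split message holds 153.

```python
from collections import defaultdict
from datetime import date
from math import ceil


def group_notices(notices):
    """Group (patron_id, library, day, title) tuples into one batch per patron, library and day."""
    grouped = defaultdict(list)
    for patron_id, library, day, title in notices:
        grouped[(patron_id, library, day)].append(title)
    return dict(grouped)


def sms_segments(text: str) -> int:
    """Number of billable parts for GSM 7-bit text (160 single, 153 per part when split)."""
    length = len(text)
    if length <= 160:
        return 1
    return ceil(length / 153)


# Seven holds reaching the shelf for one patron at one library on one day.
today = date(2019, 2, 26)
notices = [(42, "BR1", today, f"Title {n}") for n in range(1, 8)]
grouped = group_notices(notices)

for (patron_id, library, day), titles in grouped.items():
    body = f"{len(titles)} holds ready at {library}: " + ", ".join(titles)
    print(patron_id, sms_segments(body), body)
```

Listing every title can push a grouped message past one part, so a site that pays per part may prefer a count-only body such as "7 holds ready at BR1".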
16:31		yboston joined #evergreen
16:40		makohund joined #evergreen
16:47	makohund	A while back I mentioned wanting to try out evergreen on debian buster. (Was upgrading our test server from jessie to stretch, and kept going.) And I'm happy to report that it is done, and working just fine. :)
16:48	bshum	Buster? Lol, awesome
16:48	Dyrcona	makohund: Do you have a git branch to update the prerequisite installers?
16:49	Bmagic	makohund++
16:49	makohund	Nope, and TBH I've barely ever done anything with git or any other version management tools.
16:50	gmcharlt	we'll happily git-ify any changes/patches you have :)
16:50	makohund	Something that floats around on my list of "should learn when find the time". And keeps on floating, perpetually. :)
16:51	makohund	It really didn't take all that much, I could just summarize it here
16:51	gmcharlt	groovy
16:51	bshum	makohund++
16:52	makohund	Biggest prob was ejabberd... change to default internal hostname. Eventually just reinstalled & configured it from scratch.
16:53		khuckins joined #evergreen
16:59	makohund	Found the message about it... the release note for 17.08-1... https://github.com/jabber-at/ejabberd/blob/master/debian/NEWS
17:02	pinesol	News from qatests: Testing Success <http://testing.evergreen-ils.org/~live>
17:05		sandbergja_ joined #evergreen
17:05	makohund	Instead of the ejabberd config notes for stretch, follow the ones for ubuntu bionic. Except can't just uncomment mod_legacy_auth, have to add it.
17:08		mmorgan left #evergreen
17:13	Dyrcona	We need to get away from using legacy auth and teach OpenSRF to do SASL, but time....
17:14	Dyrcona	Anyway, that's it for me today.
17:17	makohund	Changes to su gave me some trouble... scripts failing to find a2enmod. Going with sudo or sudo su - (instead of sudo su) took care of that.
17:19	makohund	For prerequisites for opensrf & evergreen, I just copied the Makefile.install (and related parts) for stretch to make new ones for buster. Only two changes needed...
17:20	makohund	...remove libparent-perl and apache2-prefork-dev from the list of packages to be installed.
17:21		sandbergja_ joined #evergreen
17:21	makohund	And that's pretty much it.
17:22	makohund	Hi Jane, when's the last time you caught me on here? :)
17:25	sandbergja_	makohund: oh hi!
17:26	makohund	For ejabberd, that should be the default ERLANG_NODE node name, not the hostname. Changed to ejabberdlocalhost. Trying to follow instructions to change it didn't go so well, so I eventually nuked it and redid it.
17:26	berick	Bmagic: huzzah, after almost a day of futzing with hatch/java I have Hatch working with the Dymo. tracked it down to a bug in javafx printing that's fixed in a later version.
17:26	Bmagic	oh!
17:26	Bmagic	the "default" printer bug?
17:26	berick	Bmagic: i wanted to see if we could avoid the big fork in the code
17:27	makohund	sandbergja_: I'm finally done monkeying around with the test server
17:27	berick	Bmagic: https://bugs.openjdk.java.net/browse/JDK-8088918
17:27	Bmagic	berick: so the work around using JEditorPane isn't needed?
17:28	berick	the fix was a side effect of this bug, a toss-away comment basically that got some love
17:28	berick	Bmagic: right, we don't need the jeditor pane stuff at all
17:28	berick	no hatch changes are required
17:28	Bmagic	oh wow! That's better!
17:28	Bmagic	lol
17:28	Bmagic	hate to think it was time wasted on my part but, I'm not worried about that
17:29	makohund	sandbergja_: Upgrade on it is done. And the OS too... all the way up to debian buster.
17:29	berick	now the harder part is the installation .. requires jdk 11. but the good news is, jdk and jfx are both available under GPL for windows as zip bundles
17:29	Bmagic	if we ever ever ever need Java to manually parse CSS and use JEditorPane - we've got the friggin code
17:30	berick	Bmagic: that was part of my concern, thinking it might be brittle over time
17:30	* berick	will document findings
17:30	Bmagic	berick++
17:31	berick	I have a tine return address label here now that just says "test"
17:31	berick	s/tine/tiny/
17:31	Bmagic	oh man! The feeling of seeing that printer spit anything out for the first time is euphoric
17:32	Bmagic	RE: brittle over time - I did write in my comments "well, crap"
17:32	Bmagic	:)
17:33	berick	oh, i know, you did what had to be done
17:33	berick	nothing wrong w/ that
17:33	Bmagic	I never thought I would get butterflies when seeing something print. LOL.
17:34	Bmagic	My coworkers' jaws would drop because everyone around me knows how much I despise printers in general. So much so, I don't set up autopay with my banks because I know that they just print and mail a check. NO MORE PAPER.
17:37	sandbergja_	makohund++
18:00		sandbergja_ joined #evergreen
18:30	sandbergja_	berick: is there a reason why the eg2 Vandelay is in app/staff/cat, and the staff catalog is in app/staff/catalog
18:31	sandbergja_	and not in the same directory?
18:54	makohund	\q
19:43		jvwoolf joined #evergreen
21:57		sandbergja joined #evergreen
22:16		ningalls_ joined #evergreen
22:53		sandbergja joined #evergreen