Time |
Nick |
Message |
05:22 |
pinesol_green |
Incoming from qatests: Test Success - http://testing.evergreen-ils.org/~live/test.html <http://testing.evergreen-ils.org/~live/test.html> |
06:23 |
|
wsmoak joined #evergreen |
06:23 |
|
wsmoak joined #evergreen |
10:22 |
|
Dyrcona joined #evergreen |
10:23 |
Dyrcona |
Nine hours of Session Error: opensrfprivate.mvlcstaff.org/open-ils.cstore_drone_evergreen_1409806682.243033_21093 IS NOT CONNECTED TO THE NETWORK!!! in the logs. |
10:23 |
Dyrcona |
That just doesn't seem right. |
10:23 |
Dyrcona |
lp 1359762 |
10:23 |
pinesol_green |
Launchpad bug 1359762 in Evergreen "Internal Server Error Doing Keyword Basic Search for ISBN" (affected: 1, heat: 6) [Undecided,New] https://launchpad.net/bugs/1359762 |
10:25 |
Dyrcona |
And, it is happening right now. |
10:26 |
Dyrcona |
And the drone is good and dead, nothing running with the PID! |
10:26 |
Dyrcona |
[2014-09-06 10:22:48] /usr/sbin/apache2 [ERR :24799:EX.pm:66:1409976904247991950] Exception: OpenSRF::EX::Session 2014-09-06T10:22:48 OpenILS::Utils::CStoreEditor /usr/local/share/perl/5.14.2/OpenILS/Utils/CStoreEditor.pm:453 Session Error: opensrfprivate.mvlcstaff.org/open-ils.cstore_drone_evergreen_1409986944.186285_27116 IS NOT CONNECTED TO THE NETWORK!!! |
10:26 |
* Dyrcona |
goes to update the bug again. |
10:31 |
Dyrcona |
I'm seeing that with another PID, too. |
10:32 |
* Dyrcona |
is covering Saturday and even went into the office since I would have to come in by 6:00 PM to fiddle with the UPS, again. |
10:32 |
Dyrcona |
@monolog |
10:32 |
pinesol_green |
Dyrcona: I see nothing, I know nothing! |
10:32 |
Dyrcona |
@monologue |
10:32 |
pinesol_green |
Dyrcona: Your current monologue is at least 11 lines long. |
10:32 |
Dyrcona |
Well, spelling counts. |
10:35 |
Dyrcona |
Hmm... Could it be connected sessions that were not properly disconnected? |
10:36 |
Dyrcona |
The fact that is is a "Session Error" makes me wonder. |
10:36 |
Dyrcona |
But, then, why would it happen with Apache? |
10:37 |
Dyrcona |
Because the sessions were created in Apache and not closed because the editor normally would go out of scope and be destroyed, thus closing the session. |
10:38 |
Dyrcona |
But, in Apache, the editor hangs around as long as the Apache child runs, and so the connected session never gets disconnected... |
10:39 |
Dyrcona |
So, why is it only us who notices? |
10:39 |
Dyrcona |
Different Apache settings for max children, number of requests processed, etc.? |
10:41 |
Dyrcona |
Maybe it's a cstore setting and our drones are being cycled more rapidly than others' drones are? |
10:42 |
Dyrcona |
If I did this on a street corner, people would think me crazy. ;) |
10:42 |
jeff |
heh |
10:42 |
jeff |
here, it's appreciated. :-) |
10:43 |
Dyrcona |
:) |
10:43 |
Dyrcona |
Well, I have a new hypothesis to test. |
10:46 |
Dyrcona |
The errors all see to reference either line 450 or 453 of CStoreEditor, that gives me a place to look. |
10:48 |
Dyrcona |
The errors are coming from CStoreEditor::request which uses the internal session to do a request. |
10:48 |
Dyrcona |
It bombs if the connection needs to be connected but isn't. However, there's no check if the connection is still valid. |
10:49 |
Dyrcona |
Line 450 is in the catch block after the request is made and logs the error to osrfsys.log. |
10:49 |
Dyrcona |
Line 453 is where the error is rethrown. |
10:50 |
Dyrcona |
Thus, why those two line numbers appear. |
10:50 |
Dyrcona |
So, we need to check in request if the connection is still valid, or make sure the session is disconnected when a transaction is committed or rolled back, and have request connect when necessary. |
10:56 |
Dyrcona |
Maybe I am getting ahead of things here. |
10:57 |
Dyrcona |
What is more likely is something in Apache, i.e. probably in the OPAC, is starting a transaction but never closing it. |
10:57 |
Dyrcona |
This, I think, makes a good argument for getting CStoreEditor and the like out of the OPAC. |
11:07 |
Dyrcona |
For some reason, Brad Turcotte (http://www.bradsucks.net/) seems to me to write the perfect music for debugging. |
11:11 |
Dyrcona |
I think I'll let my ideas on the cstore sessions sit for a bit and work on something else for a while. |
11:46 |
|
wsmoak joined #evergreen |
11:46 |
|
wsmoak joined #evergreen |
15:29 |
|
jeff_ joined #evergreen |
15:29 |
|
pmurray joined #evergreen |
15:29 |
|
mtcarlson_away joined #evergreen |
15:34 |
|
wsmoak joined #evergreen |
15:34 |
|
wsmoak joined #evergreen |
16:56 |
Dyrcona |
I hear thunder. |
16:56 |
Dyrcona |
I think I'm gonna regret not having brought a jacket with me. |
17:05 |
pinesol_green |
Incoming from qatests: Test Success - http://testing.evergreen-ils.org/~live/test.html <http://testing.evergreen-ils.org/~live/test.html> |
17:33 |
|
sarabee_ joined #evergreen |
17:35 |
eeevil |
@later tell Dyrcona request does check for connected-ness, to the degree it can. is it possible that there's enough client-side work going on that the connection timeout (usually 6s for cstore, IIRC) is being hit, and the cstore backend gives up waiting? doesn't directly explain NOT CONNECTEDs, but if they get recycled enough due to timeout, and you have a low max_requests, that could show up then |
17:35 |
pinesol_green |
eeevil: The operation succeeded. |
17:40 |
|
tsbere joined #evergreen |
19:22 |
|
sarabee joined #evergreen |
20:02 |
|
sarabee_ joined #evergreen |