Conversation

Notices

  1. I'm seeing those "bad SHA-1 HMAC" errors at around 1100-1700 notices in each of my !gnusocial rotated logs (default logrotate settings).  A bit disconcerting.

    Sunday, 22-May-16 16:57:03 UTC from gs.kawa-kun.com
    1. @takemangoakenji Bad hashes for the hubsub/feedsub secrets, maybe?

      Don't understand the need for secrets for a subscription personally.

      Sunday, 22-May-16 17:01:23 UTC from community.highlandarrow.com
      1. @maiyannah Something like 6600 bad HMAC errors over the past two months.

        Sunday, 22-May-16 17:02:48 UTC from gs.kawa-kun.com
        1. @takepapayaakenji Well I just mean to say, that would be my first inclination as to why those errors are occurring.

          Sunday, 22-May-16 17:03:59 UTC from community.highlandarrow.com
          1. @maiyannah @mmn @moonman If you're curious about the worst offenders, I put it here: http://u.daggsy.com/2V

            There aren't any secrets or HMACs in there; just the users, some event IDs, and partial notice content.  Yes, it's DropBox, but 1.2 MB uncompressed seemed a bit much for a Pastebin analog.

            Sunday, 22-May-16 17:06:30 UTC from gs.kawa-kun.com
            1. @takegrapeakenji @moonman @mmn I probably wouldn't mind dropbox as much if they didn't keep nagging me to get an account.

              Sunday, 22-May-16 17:07:45 UTC from community.highlandarrow.com
            2. @mmn @moonman @maiyannah Some instances do odd things with the <id> field of the AtomPub, but I didn't want to spend too much time figuring out the perfect way of getting a useful URL.

              Sunday, 22-May-16 17:07:58 UTC from gs.kawa-kun.com
              1. @takepapayaakenji @moonman @mmn I notice it seems like a few problem instances, so I still feel this is communications related.  Have you had any further problems related to SPC since your fix last night?

                Sunday, 22-May-16 17:10:42 UTC from community.highlandarrow.com
                1. @maiyannah @mmn @moonman freezepeach.xyz seemed to be the worst as far as that subscription issue, but I'm definitely seeing stuff from users I hadn't seen before.

                  RDN seems to be the most obviously bad one with respect to the bad HMAC errors.  It's not alone, though.

                  Sunday, 22-May-16 17:13:49 UTC from gs.kawa-kun.com
                  1. @takecherryakenji @moonman @mmn I will investigate this further when I manage to drag my ass out of bed and eat, been a real bad day for arthritis.  Keep me in the loop though so I know if you fix it in the meantime, thanks :)

                    Sunday, 22-May-16 17:15:25 UTC from community.highlandarrow.com
                2. @takegrapeakenji How many of those messages are ones you actually missed?  Pretty sure you replied to at least a few of the ones from my instance.

                  Sunday, 22-May-16 17:13:54 UTC from community.highlandarrow.com
                  1. @maiyannah I know I've been missing stuff from RDN, and some from Quitter.se.  I don't see any of the bad HMAC errors from your instance in my logs since I started my own instance.

                    Sunday, 22-May-16 17:15:30 UTC from gs.kawa-kun.com
                    1. @takeFrankerZakenji There's a few in that zipped log, but I know at least some of those messages you RTd/favd/replied to

                      Sunday, 22-May-16 17:19:55 UTC from community.highlandarrow.com
                      1. @maiyannah Are they notices that are replies to me?

                        Sunday, 22-May-16 17:21:44 UTC from gs.kawa-kun.com
                      2. @takeFrankerZakenji As far as I can see, HMAC is used in the API for the secrets exchanged between servers, so from a cursory glance it appears that the handshake between servers is failing in this case.

                        In /plugins/ostatus/classes/hubsub.php:

                        if ($this->secret) {
                           $hmac = hash_hmac('sha1', $atom, $this->secret);
                           $headers[] = "X-Hub-Signature: sha1=$hmac";
                        } else {
                           $hmac = '(none)';
                        }

                        similar in feedsub

                        So what is probably happening is the HTTP header X-Hub-Signature is not what the receiving instance (yours) expected it to be.

                        Sunday, 22-May-16 17:26:12 UTC from community.highlandarrow.com
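The sign-then-compare step described in the notice above can be sketched as follows. This is a minimal illustration in Python rather than GNU social's PHP; the function names and the sample secret are hypothetical, and only the `sha1=<hex>` header format is taken from the quoted snippet.

```python
import hashlib
import hmac


def sign_payload(atom: bytes, secret: bytes) -> str:
    """Build an X-Hub-Signature value in the same 'sha1=<hex>' form as the quoted PHP."""
    digest = hmac.new(secret, atom, hashlib.sha1).hexdigest()
    return f"sha1={digest}"


def verify_signature(atom: bytes, header: str, secret: bytes) -> bool:
    """Recompute the HMAC with the locally stored secret and compare it to the header."""
    expected = sign_payload(atom, secret)
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(expected.encode(), header.encode())


if __name__ == "__main__":
    atom = b"<entry>hello</entry>"
    header = sign_payload(atom, b"sender-secret")  # hypothetical secret
    print(verify_signature(atom, header, b"sender-secret"))    # same secret on both sides
    print(verify_signature(atom, header, b"receiver-secret"))  # secrets out of sync
```

If the two sides hold different secrets, the comparison fails even though the payload itself arrived intact, which is consistent with the "bad HMAC" errors discussed here.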
                        1. @takepapayaakenji By the way, in looking at this I see where feedsub sets a failed subscription to inactive, so I made a note of that in case I can try to add a retry mechanism.

                          Sunday, 22-May-16 17:29:02 UTC from community.highlandarrow.com
                          1. @takePotato Knishesakenji The code that actually generates the error you are seeing is in /plugins/ostatus/classes/feedsub.php around line 489

                                protected function validatePushSig($post, $hmac)

                            It is comparing their given 'secret' hash they sent in that HTTP header request, to what is stored locally in the database.  That exception is thrown if they do not match.

                            As to why they don't match, it's likely because the subscription did not complete successfully.  I have a theory this field is blank, maybe check RDN subscriptions for an affected RDN user and see if it has a 'secret' in the table?  I'd give you a query to try but I'm not at my computer :/   But I suspect that since the subscription didn't complete, the subscribing instance didn't get the secret hash, and as a result, they don't have it to present at negotiation time for the subscription pushes.

                            Sunday, 22-May-16 17:32:41 UTC from community.highlandarrow.com
                            1. @takebananaakenji As far as I can see, there is no check to ensure it's not blank or NULL so it's very possible this is the problem.

                              Sunday, 22-May-16 17:34:49 UTC from community.highlandarrow.com
    2. @takeFrankerZakenji It occurred to me: the subscriptions to feed are refreshed on a certain interval, to regenerate the secret, so what could be happening too is the secrets are getting out of sync as a result.

      Sunday, 22-May-16 18:25:39 UTC from community.highlandarrow.com
      1. @maiyannah Speaking of refreshing, it looks like active subscriptions go inactive when the relevant instance is unavailable.  This time, it's quitter.no.

        Sunday, 22-May-16 18:32:56 UTC from gs.kawa-kun.com
        1. @takekiwiakenji Which then rejects pushes when they happen because it's inactive...

          I think the nature of our problem is starting to become clear.

          Sunday, 22-May-16 18:46:57 UTC from community.highlandarrow.com
          1. @maiyannah So we have bad secrets, empty secrets, and inactive subscriptions.  Any other things I should scan for in feedsubs?

            Sunday, 22-May-16 18:58:27 UTC from gs.kawa-kun.com
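A scan for two of the conditions listed in the notice above (empty secrets and inactive subscriptions) could look roughly like this. It is a sketch only: the record fields `uri`, `secret`, and `sub_state`, and the state value `"active"`, are assumptions made for illustration and have not been checked against the real feedsub table. A secret that is merely out of sync with the hub cannot be detected from local data alone, so it is not covered.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class FeedSubRow:
    # Hypothetical shape of one feedsub record; field names are assumptions.
    uri: str
    secret: Optional[str]
    sub_state: str


def find_suspect_rows(rows: List[FeedSubRow]) -> List[Tuple[str, str]]:
    """Return (uri, reason) pairs for rows with no secret or a non-active state."""
    suspects = []
    for row in rows:
        if not row.secret:  # catches both None and the empty string
            suspects.append((row.uri, "empty secret"))
        if row.sub_state != "active":
            suspects.append((row.uri, f"state is {row.sub_state}"))
    return suspects


if __name__ == "__main__":
    sample = [
        FeedSubRow("https://a.example/feed", "abc123", "active"),
        FeedSubRow("https://b.example/feed", None, "active"),
        FeedSubRow("https://c.example/feed", "def456", "inactive"),
    ]
    for uri, reason in find_suspect_rows(sample):
        print(uri, "->", reason)
```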
            1. @takegrapeakenji What I would pay attention to is a feedsub for an instance you've had issues with, which is about to expire at a time when you can watch the regeneration.  It would confirm/deny that bit.

              It basically looks like we have two communication errors that cause the same problem:

              1] Not properly communicating at time of subscription, and

              2] Not properly communicating when the subscription comes up for renewal,

              both of these are fault-intolerant and have no retry mechanism, and when they fail, you end up without the secret needed to verify the signed Atom feed that gets sent.

              The inactive feed status is symptomatic more than cause, from what I can see, but nonetheless related.

              Sunday, 22-May-16 19:07:13 UTC from community.highlandarrow.com
              1. @maiyannah Some sort of retry and verification mechanism is needed, eh?  Wouldn't that need a fix in whatever 'open' protocol is used for that?  Diaspora and so on need the fix, too.

                Sunday, 22-May-16 19:08:51 UTC from gs.kawa-kun.com
                1. @takePotato Knishesakenji ostatus has a retry mechanism for these things, it's just disabled by default.

                  But this is gnuSocial's (and other programs') implementation.

                  Rather than handling the exception by saying "maybe we should try to resolve this issue" it just discards the communication.  It handles errors by logging them and then doing precisely nothing, which makes it fault intolerant.  TCP is inherently prone to losing packets.  Any framework or standard that uses it needs to take this into account or it is flawed by design.  In this case, I would argue that gnuSocial not attempting to mitigate packet loss or other kinds of miscommunication is a design flaw.

                  Moreover, it doesn't tell you it's doing this really, as evidenced by the fact that we had to go digging to figure this out.

                  Sunday, 22-May-16 19:18:49 UTC from community.highlandarrow.com
                  1. @takekiwiakenji At the very least I would propose having a separate 'feedsub' status, distinct from inactive, for a subscription that failed because of an error.  gnuSocial is putting a feed in a non-error state when an error occurs.

                    Sunday, 22-May-16 19:19:56 UTC from community.highlandarrow.com
                    1. @takebatcaveakenji In fact, thinking on it for a moment now, I think this would be the ideal solution.  If you put it into an error state, you can then have the retry mechanism put in if the sub is in an error state (default it to 0 retries like normal ostatus stuff if it bothers you that much) and you could just toss an event hook in there for people who want to handle the error state a different way.  This makes it clear the stream had an error, adds optional fault tolerance, and makes it so you can have extended reactive code if you desire, with what I think would be relatively minimal changes to the underlying software.

                      Sunday, 22-May-16 19:23:05 UTC from community.highlandarrow.com
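The proposal in the notice above (a distinct error state, an optional retry count defaulting to 0, and a hook for custom handling) might be sketched like this. Everything here is hypothetical: the `Subscription` class, the `"error"` state name, and the `renew` and `on_error` callables are illustrative stand-ins in Python, not GNU social code.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Subscription:
    # Hypothetical stand-in for a feedsub record.
    uri: str
    state: str = "active"
    failures: int = 0


def renew_with_retry(
    sub: Subscription,
    renew: Callable[[Subscription], None],
    max_retries: int = 0,
    on_error: Optional[Callable[[Subscription, Optional[Exception]], None]] = None,
) -> bool:
    """Attempt a renewal; after all attempts fail, mark the sub 'error' and fire the hook."""
    last_exc: Optional[Exception] = None
    for _ in range(max_retries + 1):  # one initial attempt plus the allowed retries
        try:
            renew(sub)
        except Exception as exc:
            sub.failures += 1
            last_exc = exc
        else:
            sub.state = "active"
            return True
    # A distinct error state, so a failed renewal is not mistaken for a deliberate 'inactive'.
    sub.state = "error"
    if on_error is not None:
        on_error(sub, last_exc)
    return False
```

With `max_retries=0` the first failure immediately moves the subscription into the error state, matching the suggested default, while operators who want fault tolerance can raise the count or attach their own `on_error` handler.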
                      1. @maiyannah My only concern is about how slow upstream is when accepting changes.

                        Sunday, 22-May-16 19:23:40 UTC from gs.kawa-kun.com
                        1. @takecherryakenji Well, the wonders of open source free software is that I could just put the modified files up somewheres or email it to people and they can apply it themselves.

                          And if it ever gets terribad you can always fork it.

                          Sunday, 22-May-16 19:25:45 UTC from community.highlandarrow.com
                      2. @takePotato Knishesakenji I will add this to my monumental gnusocial todo list and see if I can't do it myself next time I'm in the groove/mindset to work on code.

                        Sunday, 22-May-16 19:24:52 UTC from community.highlandarrow.com
    3. I'm seeing those "bad SHA-1 HMAC" errors at around 1100-1700 notices in each of my !gnusocial rotated logs (default logrotate settings).  A bit disconcerting.

      Monday, 23-May-16 00:00:17 UTC from gs.kawa-kun.com
    4. @takeFrankerZakenji Before I upgraded from # to #, I used to use the "two-step" http://url.federati.net/n2GDo ... but it is untested with GS. May or may not work. May or may not damage your instance.

      Monday, 23-May-16 00:33:21 UTC from fresh.federati.net