OK, this is a really weird one and I'm not even sure how to properly describe it. We had a customer complain that a specific page on our website wasn't working, and one of our internal technicians was able to reproduce the issue as well. Most of the website is working fine. This is deployed on Azure App Service.

I tried loading the exact same page as the technician, and it worked fine for me. The entire request is identical, except for the authentication cookies. When I run the request, I get 200 OK, but the technician and the customer get 404 NOT FOUND.

The issue only started after we did a VIP swap this morning on Azure App Service (which I am new to). I deployed a service update this morning to the staging deployment slot, then a few minutes later did the VIP swap. I think that both the customer and the technician had their browsers open and sessions active during the VIP swap.

I've done some troubleshooting, and here is what I discovered. I can use Fiddler to capture the exact trace for the web page that works fine for me. Then I can copy just one value from the technician's failing request into mine, and suddenly I can reproduce the 404 error as well. The difference is one cookie:

Cookie: ARRAffinity=blahblahblahblah;

My basic understanding is that this is a key for identifying which server the user is connecting to so they get affinity to a specific instance in the load-balanced set (2 servers). We were able to fix the issue by having the technician and customer delete all cookies in their browser, but even logging out and back in wouldn't fix the issue.
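
For anyone who wants to repeat the comparison without Fiddler, here is a rough sketch of the same test in Python, using only the standard library. The URL and cookie values are placeholders, and `build_request`, `fetch_status`, and `compare` are names I made up for illustration:

```python
import urllib.error
import urllib.request


def build_request(url, affinity=None, other_cookies=None):
    """Build a GET request, optionally pinning it with an ARRAffinity cookie."""
    cookies = dict(other_cookies or {})
    if affinity is not None:
        cookies["ARRAffinity"] = affinity
    req = urllib.request.Request(url)
    if cookies:
        cookie_header = "; ".join(f"{name}={value}" for name, value in cookies.items())
        req.add_header("Cookie", cookie_header)
    return req


def fetch_status(req):
    """Send the request and return the HTTP status code, including 4xx/5xx."""
    try:
        with urllib.request.urlopen(req, timeout=10) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code


def compare(url, stale_affinity, other_cookies=None):
    """Return (status without the affinity cookie, status with the stale one)."""
    without = fetch_status(build_request(url, other_cookies=other_cookies))
    with_stale = fetch_status(build_request(url, stale_affinity, other_cookies))
    return without, with_stale


# Hypothetical usage; substitute the real page URL and the cookie value
# copied from the failing request:
# print(compare("https://example.invalid/some/page", "blahblahblahblah"))
```

If the two status codes differ while everything else in the request is the same, the affinity cookie is the only variable left.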

Why would a "stale" affinity key cause a 404 on one specific page? Is it possible that some of the user's requests are actually being directed to the old staging deployment site, even though they are hitting the URL that connects to the Production deployment site?

mellamokb

1 Answer

There are 2 things here:

  1. The session affinity. As you may read in this article, you can now disable session affinity in web apps, if that suits your use case (e.g. you handle sessions outside the web app, or you simply don't have any session-specific info).
  2. The 404 error is a strange one. It may come from a faulty deployment, so you may want to redo a full deployment to a new slot and swap it again. If you still get errors, take a look at the website itself and see whether there is any "stateful" code that would give you specific behavior.
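
If you go the route of point 1, one way I know of is to have the app send the `Arr-Disable-Session-Affinity: true` response header, which tells the App Service front end not to issue the ARRAffinity cookie. Below is a minimal sketch as a Python WSGI middleware. This assumes a WSGI app, which is probably not your stack, and `disable_arr_affinity` and `hello_app` are made-up names; the same idea should carry over to any framework that lets you add a response header:

```python
def disable_arr_affinity(app):
    """Wrap a WSGI app so every response carries Arr-Disable-Session-Affinity."""

    def wrapped(environ, start_response):
        def start_with_header(status, headers, exc_info=None):
            # Append the header without mutating the caller's list.
            headers = list(headers) + [("Arr-Disable-Session-Affinity", "true")]
            return start_response(status, headers, exc_info)

        return app(environ, start_with_header)

    return wrapped


def hello_app(environ, start_response):
    """Tiny placeholder app, only here to show the wrapper in use."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"hello"]


application = disable_arr_affinity(hello_app)
```

Only do this if nothing in the app depends on a user landing on the same instance for the whole session.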

Please let us know what happened at the end.