Password Rotation Examined Through Probability
Got into an argument at work over whether to build a periodic password rotation feature. The other side says “it’s safer if users can rotate regularly.” I think it’s probabilistically almost useless.
Of course, not changing a leaked password is bad. But that’s “change it because there’s evidence of compromise,” not “change it because 90 days have passed.”
NIST SP 800-63B takes the position that verifiers should not require periodic password changes, but should force a change when there’s evidence of authenticator compromise. Microsoft 365’s admin guidance also recommends setting cloud-only accounts to never expire passwords. The reasoning is practical: the more rules you pile on, the weaker the passwords people choose.
The Window Where Rotation Actually Helps Is Narrow
Periodic password rotation protects against roughly one scenario.
flowchart TD
A[Password is leaked] --> B[Not yet detected]
B --> C[Attacker hasn't used it yet]
C --> D[Rotation date arrives]
D --> E[Old password is invalidated]
In other words, rotation only saves you when the attacker has obtained the password but hasn’t logged in yet, and the user happens to change it in time.
In the following scenarios, rotation doesn’t help much.
| Scenario | Effect of rotation |
|---|---|
| Attacker logs in right after phishing | They’re in before rotation day |
| Password is reused across services | Leaked via another service anyway |
| Device is infected with malware | New password gets captured too |
| Session cookie is stolen | Entry via a different vector entirely |
| Leak is already detected | Immediate change needed, not scheduled |
Saying “rotation is pointless” is a bit sloppy. It has value. But the window where it works is narrow: the password has leaked, it hasn’t been used yet, and it hasn’t been detected either.
The probability that rotation invalidates a password before the attacker uses it is a product of three conditions.
$$P(\text{saved}) = P(\text{leak}) \times P(\text{undetected} \mid \text{leak}) \times P(\text{unused before rotation} \mid \text{leak, undetected})$$
Suppose the probability of password leakage per 90-day cycle is 1%, the probability the leak goes undetected is 50%, and the probability the attacker doesn’t use it before rotation day is 10%.
$$P(\text{saved}) = 0.01 \times 0.5 \times 0.1 = 0.0005$$
That’s 0.05% per account per 90-day cycle. Double every assumption and you get 0.4%. It stays a fraction of a percent.
For an organization with 10,000 accounts running 4 rotations per year, the expected number of accounts saved by rotation is:
$$10{,}000 \times 4 \times 0.0005 = 20$$
20 per year. Not zero, but considering the helpdesk calls and lockouts that come with password resets, whether it’s worth it organizationally is questionable.
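The arithmetic above is easy to rerun with your own numbers. Here is a minimal sketch; the three probabilities are the illustrative assumptions from the text, not measured values, and the function names are mine.

```python
# Assumed inputs, taken from the illustrative numbers in the text.
P_LEAK = 0.01        # password leaks during one 90-day cycle
P_UNDETECTED = 0.50  # the leak goes undetected
P_UNUSED = 0.10      # the attacker has not used it before rotation day


def p_saved_per_cycle(p_leak: float, p_undetected: float, p_unused: float) -> float:
    """Probability that rotation invalidates a leaked, undetected, unused password."""
    return p_leak * p_undetected * p_unused


def expected_saved_per_year(accounts: int, cycles_per_year: int, p_saved: float) -> float:
    """Expected number of accounts per year where rotation made the difference."""
    return accounts * cycles_per_year * p_saved


p = p_saved_per_cycle(P_LEAK, P_UNDETECTED, P_UNUSED)
print(f"per account per cycle: {p:.4%}")
print(f"per year, 10,000 accounts: {expected_saved_per_year(10_000, 4, p):.0f}")
```

Plugging in your own estimates is more useful than arguing about mine. The structure stays the same: three probabilities multiplied together, each well below 1.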
With 90-day rotation, the average time from leak to invalidation is 45 days, assuming leaks are spread evenly across the cycle. With 30-day rotation, it’s 15 days. Looking at these numbers alone, shorter seems stronger.
But if attackers use stolen passwords within minutes to hours, shortening from 45 to 15 days barely makes a difference. It works against “old leaked credential lists that might be used someday” but not against “steal now, use now” attacks.
Google’s research shows that phished credentials are used for login attempts almost immediately. If rotation cycles are measured in 90 or 30 days, they’re 2-3 orders of magnitude off from attacker speed.
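To make the speed mismatch concrete, here is a rough sketch of the race between rotation day and the attacker. It assumes the leak moment is uniformly distributed over the cycle and that the attacker waits a fixed delay before the first login. Both are simplifications I am introducing for illustration; real attacker behavior is messier.

```python
def p_rotation_wins(cycle_days: float, attacker_delay_hours: float) -> float:
    """Chance that rotation day arrives before the attacker's first login.

    Assumes the leak moment is uniform over the cycle and the attacker
    waits a fixed delay before using the password (both are simplifications).
    Rotation wins only if the leak lands in the last `delay` hours of the cycle.
    """
    cycle_hours = cycle_days * 24
    return min(attacker_delay_hours / cycle_hours, 1.0)


for cycle in (90, 30):
    for delay in (1, 24, 24 * 30):  # one hour, one day, thirty days
        p = p_rotation_wins(cycle, delay)
        print(f"{cycle}-day cycle, attacker waits {delay} h: {p:.2%}")
```

Under this model, cutting the cycle from 90 to 30 days triples a number that is already tiny when the attacker moves within hours. The shorter cycle only starts to matter for credentials that sit unused for weeks.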
User Behavior Degradation Belongs in the Probability Too
What’s often missing from the rotation debate is the probability that forced changes make users weaker.
People don’t come up with entirely new strong passwords every 90 days. A significant fraction will increment a trailing number, insert a season name, recycle old patterns, write it on a sticky note, or reuse the same password on other services.
Research from UNC Chapel Hill (Zhang et al., 2010) found that in an environment with forced periodic changes, 41% of new passwords could be cracked from the previous password within 3 seconds. Most transformation patterns were mechanical: incrementing a trailing number, shifting uppercase positions, swapping symbols. If an attacker knows the rules, the search space shrinks dramatically.
In formula terms, the conditional probability of an attacker guessing the new password given the old one is $P(\text{new} \mid \text{old}) \approx 0.41$ (offline, within 3 seconds). Even in online attacks, the same study reports $P(\text{new} \mid \text{old}) \approx 0.17$ within 5 attempts. So even when rotation invalidates the old password, an attacker holding it still breaks through about 40% of the time in the offline case.
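To show how mechanical these transformations are, here is a toy generator that produces a few guesses from a known old password. This is not the algorithm from the study, only a sketch of the three patterns named above; the swap table and the function name are hypothetical.

```python
import re

# Hypothetical symbol swaps; real attack tooling uses far larger rule sets.
SYMBOL_SWAPS = {"!": "@", "@": "#", "#": "$", "$": "!"}


def derive_candidates(old: str) -> list[str]:
    """Return guesses an attacker might try first when the old password is known."""
    candidates: list[str] = []

    # 1. Increment the last run of digits, keeping anything after it in place.
    match = re.fullmatch(r"(.*?)(\d+)(\D*)", old)
    if match:
        head, digits, tail = match.groups()
        bumped = str(int(digits) + 1).zfill(len(digits))
        candidates.append(f"{head}{bumped}{tail}")

    # 2. Swap the final symbol for another common one.
    if old and old[-1] in SYMBOL_SWAPS:
        candidates.append(old[:-1] + SYMBOL_SWAPS[old[-1]])

    # 3. Shift capitalization from the first letter to the second.
    if len(old) >= 2 and old[0].isupper() and old[1].islower():
        candidates.append(old[0].lower() + old[1].upper() + old[2:])

    return candidates


print(derive_candidates("Summer2024!"))
```

Three rules already cover a lot of what people do on rotation day. A real rule set is larger, but it is still tiny compared to searching from scratch.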
Microsoft’s guidance also states that rules requiring password changes tend to weaken password quality. NIST’s shift to “don’t force periodic changes” isn’t just about being nice to users. Forced rotation creates patterns that attackers can predict.
From a probability perspective, the comparison looks like this.
| Policy | What increases | What decreases |
|---|---|---|
| With periodic rotation | Weak derivative passwords, support tickets, lockouts, reuse | Validity period of undetected leaks |
| Without periodic rotation | Duration old passwords remain valid | Change fatigue, lazy derivatives, written-down passwords |
If you’re going to implement rotation, the benefit of “shortening the validity period of undetected leaks” needs to exceed the cost of “users drifting toward weaker practices.” Implementing it without estimating this tradeoff—just because “it seems safer”—is a risky spec decision.
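One rough way to start estimating the tradeoff is to discount rotation’s benefit by the chance that the new password is derivable from the old one. The sketch below combines the 20-accounts-per-year estimate with the 41% offline figure quoted earlier. Treating that 41% as the attacker’s success rate after rotation is my assumption, and a pessimistic one, since it presumes the attacker can test guesses offline.

```python
ACCOUNTS_SAVED_PER_YEAR = 20  # from the 10,000-account estimate earlier in the text
P_DERIVABLE = 0.41            # offline figure attributed to Zhang et al. (2010)


def still_protected(saved: float, p_derivable: float) -> float:
    """Accounts that stay protected after discounting predictable new passwords."""
    return saved * (1 - p_derivable)


remaining = still_protected(ACCOUNTS_SAVED_PER_YEAR, P_DERIVABLE)
print(f"{remaining:.1f} accounts per year still protected by rotation")
```

This only discounts the benefit side. The cost side (support tickets, lockouts, weaker choices across all 10,000 accounts) is not in the model and has to be estimated separately.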
When Passwords Should Be Changed
The case for periodic rotation is thin, but the password change feature itself is still needed.
What’s needed is event-driven changes, not calendar-driven ones.
| Trigger | Response |
|---|---|
| Service-side breach | Force change for affected users |
| Match against known-breached password lists | Reject at login or change time |
| Suspicious login | Additional authentication, notification, change request if needed |
| User self-report | Invalidate existing sessions and change |
| Admin account recovery | Reset flow, not a temporary password |
The OWASP Authentication Cheat Sheet also calls for credential rotation when breaches or compromises are confirmed, while discouraging periodic change requirements. This is pretty consistent across the board.
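As a sketch of what event-driven means in code, the trigger table above can be expressed as a plain lookup. The enum members and action names are hypothetical labels I made up to mirror the table, not any framework’s API.

```python
from enum import Enum, auto


class Trigger(Enum):
    SERVICE_BREACH = auto()
    BREACHED_PASSWORD_MATCH = auto()
    SUSPICIOUS_LOGIN = auto()
    USER_SELF_REPORT = auto()
    ADMIN_RECOVERY = auto()


# Responses mirror the table above; the action names are hypothetical labels.
RESPONSES: dict[Trigger, list[str]] = {
    Trigger.SERVICE_BREACH: ["force_password_change"],
    Trigger.BREACHED_PASSWORD_MATCH: ["reject_password"],
    Trigger.SUSPICIOUS_LOGIN: ["require_additional_auth", "notify_user"],
    Trigger.USER_SELF_REPORT: ["invalidate_sessions", "force_password_change"],
    Trigger.ADMIN_RECOVERY: ["start_reset_flow"],
}


def actions_for(trigger: Trigger) -> list[str]:
    """Look up the actions for an event. Elapsed time is deliberately not a trigger."""
    return RESPONSES[trigger]


print(actions_for(Trigger.SUSPICIOUS_LOGIN))
```

The notable thing is what is missing: there is no `DAYS_SINCE_LAST_CHANGE` member. Every entry corresponds to evidence, not to a calendar.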
Better Expected Value by Investing Beyond Passwords
For the same engineering effort, there are things to do before building a rotation screen.
Allow sufficient password length. Block common and breached passwords. Rate-limit login attempts. Don’t interfere with password managers. Implement MFA (multi-factor authentication). Notify on suspicious logins. Decide how to handle existing sessions on password change.
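The first two items fit in a few lines. Below is a minimal sketch of a password acceptance check that uses only a minimum length and a breach blocklist. The minimum of 12 and the four-entry blocklist are placeholders I chose; a real deployment would load a proper breached-password corpus.

```python
import hashlib

MIN_LENGTH = 12  # assumed minimum; pick a value that fits your service

# Stand-in for a real breached-password corpus, keyed by SHA-1 hex digest.
# SHA-1 is used here only as a lookup key, never for storing user passwords.
BREACHED_SHA1 = {
    hashlib.sha1(p.encode("utf-8")).hexdigest()
    for p in ("password", "123456789012", "Password1!", "qwertyuiopas")
}


def check_new_password(candidate: str) -> tuple[bool, str]:
    """Accept or reject a new password using length and a breach blocklist only."""
    if len(candidate) < MIN_LENGTH:
        return False, f"must be at least {MIN_LENGTH} characters"
    digest = hashlib.sha1(candidate.encode("utf-8")).hexdigest()
    if digest in BREACHED_SHA1:
        return False, "appears in a known breach list"
    return True, "ok"


print(check_new_password("123456789012"))
print(check_new_password("vkrmxjqwpbtf"))
```

There are no character-class checks and no maximum length, so password managers can generate whatever they like.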
I’ve previously written about implementing TOTP authentication in your own service. TOTP itself isn’t a phishing-resistant method, but it’s significantly better than passwords alone. Microsoft’s statistics show that over 99.9% of automated attacks against MFA-enabled accounts are blocked. Compared to rotation’s contribution at the 0.05% order, MFA is over 3 orders of magnitude more efficient for the same effort. For even stronger protection, move toward phishing-resistant methods like passkeys or WebAuthn.
This kind of spec isn’t about building an unbreakable system. It’s about where you place attack costs and user burden. As I wrote in an article on voting system design, even identity verification can’t reduce fraud to zero. Authentication is the same: you can only decide what probability of residual risk you’ll accept.
2FA and 2SA (Two-Step Authentication) Are Different Things
When MFA comes up, I often see 2FA (two-factor authentication) and 2SA (two-step authentication) treated as the same thing. In practice, these offer different levels of defense.
2FA means two different types of authentication factors are used. Authentication factors fall into three categories.
| Factor | Description | Examples |
|---|---|---|
| Knowledge (Something you know) | Information only the user knows | Password, PIN |
| Possession (Something you have) | A physical item the user has | Smartphone, security key, smart card |
| Biometric (Something you are) | A physical characteristic of the user | Fingerprint, face, iris |
2FA combines two different types from these categories. With password (knowledge) + TOTP app (possession), a leaked password alone isn’t enough. The attacker also needs access to the phone.
2SA, on the other hand, means “there are two authentication steps,” but not necessarily two different factor types. For example, entering a password and then an SMS confirmation code counts as two steps, and nominally adds a possession factor. But if the SMS can be intercepted via a SIM swap attack (where the attacker takes over the phone number), that possession factor is weak. NIST SP 800-63B also classifies SMS-based OTP as a “restricted authenticator” and treats it as a less preferred option.
Ranked by realistic strength, it looks like this.
flowchart LR
A["Password only"] --> B["Password +<br/>SMS OTP"]
B --> C["Password +<br/>TOTP app"]
C --> D["Password +<br/>Security key"]
D --> E["Passkey<br/>(passwordless)"]
Password + SMS OTP is a common 2SA configuration. Judged as 2FA, though, its possession factor can be compromised without touching the phone, through SIM swapping (hijacking the phone number) or SS7 interception (exploiting vulnerabilities in the telephone signaling protocol to intercept SMS). With TOTP apps or security keys, the secret lives on the device, so the factors stay largely independent unless the attacker gets hold of the device itself.
When someone says “add two-step authentication” in a spec discussion, evaluate based on factor independence, not step count. Calling SMS OTP “2FA” and feeling secure is the most dangerous outcome.
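Evaluating by factor independence instead of step count can be sketched as a small classifier. The authenticator catalog below is a hypothetical example (the security question entry is mine, added to show a same-factor pair), and the three verdict strings are just labels.

```python
from enum import Enum


class Factor(Enum):
    KNOWLEDGE = "something you know"
    POSSESSION = "something you have"
    BIOMETRIC = "something you are"


# Hypothetical catalog of authenticators and the factor each one represents.
AUTHENTICATOR_FACTOR = {
    "password": Factor.KNOWLEDGE,
    "pin": Factor.KNOWLEDGE,
    "security_question": Factor.KNOWLEDGE,
    "sms_otp": Factor.POSSESSION,
    "totp_app": Factor.POSSESSION,
    "security_key": Factor.POSSESSION,
    "fingerprint": Factor.BIOMETRIC,
}

# Possession factors the text flags as interceptable without the device.
WEAK_POSSESSION = {"sms_otp"}


def assess(steps: list[str]) -> str:
    """Classify a login flow by factor independence, not by number of steps."""
    if len(steps) < 2:
        return "single step"
    factors = {AUTHENTICATOR_FACTOR[s] for s in steps}
    if len(factors) < 2:
        return "2SA only: one factor type"
    if any(s in WEAK_POSSESSION for s in steps):
        return "2FA on paper: possession factor is interceptable"
    return "2FA: independent factors"


for flow in (["password", "security_question"], ["password", "sms_otp"], ["password", "totp_app"]):
    print(flow, "->", assess(flow))
```

All three flows have two steps. Only the step count is identical; the verdicts are not.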
Character-Class Requirements Also Erode Entropy
“Must contain uppercase, lowercase, numbers, and symbols” is another common requirement. In theory, more character types expand the search space. In practice, user behavior says otherwise.
Choosing an 8-character password completely at random from 95 printable ASCII characters gives $95^8 \approx 6.6 \times 10^{15}$ combinations, about 52.6 bits of entropy. Adding a constraint requiring all 4 character types doesn’t narrow the theoretical space much if generation is truly random.
But most users forced to include all character types capitalize the first letter of an English word, append 2 digits, and add ! or @ at the end.
The effective search space becomes: dictionary word count × digit combinations × symbol choices.
With 20,000 common words, 100 two-digit suffixes (00-99), and 33 symbols:
$$20{,}000 \times 100 \times 33 = 6.6 \times 10^{7}$$
That’s about 26 bits of entropy. Weaker by over 11 bits than random lowercase 8-character passwords at $26^8 \approx 2.1 \times 10^{11}$ (about 37.6 bits). That is a difference of about 3,200 times.
Character-class requirements maintain the theoretical space but bias the distribution humans actually choose, lowering effective entropy. NIST SP 800-63B explicitly states that composition rules should not be imposed.
To reliably increase password strength, raising the minimum length is more effective than requiring character types.
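| Condition | Combinations | Entropy |
|---|---|---|
| 4-type forced 8-char (patterned) | $6.6 \times 10^{7}$ | ~26 bits |
| Lowercase-only random 8-char | $26^8 \approx 2.1 \times 10^{11}$ | ~37.6 bits |
| Lowercase-only random 12-char | $26^{12} \approx 9.5 \times 10^{16}$ | ~56.4 bits |
| Lowercase-only random 16-char | $26^{16} \approx 4.4 \times 10^{22}$ | ~75.2 bits |
| 95-type random 8-char | $95^8 \approx 6.6 \times 10^{15}$ | ~52.6 bits |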
A patterned 8-character password forced to include all 4 character types is about 1.4 billion times weaker than random lowercase 12 characters. “Include uppercase and symbols” raises attacker search cost far less than “make it 12+ characters.”
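The figures in this section can be recomputed in a few lines. This is a sketch under the same assumptions as the text: a 20,000-word dictionary, 100 two-digit suffixes, 33 symbols, and uniform random choice for the other rows.

```python
import math


def entropy_bits(combinations: int) -> float:
    """Entropy in bits of a uniform choice among `combinations` options."""
    return math.log2(combinations)


# Search space sizes under the assumptions stated in the text.
SEARCH_SPACES = {
    "4-type forced 8-char (patterned)": 20_000 * 100 * 33,
    "lowercase-only random 8-char": 26 ** 8,
    "lowercase-only random 12-char": 26 ** 12,
    "lowercase-only random 16-char": 26 ** 16,
    "95-type random 8-char": 95 ** 8,
}

for name, size in SEARCH_SPACES.items():
    print(f"{name:34s} {size:.2e}  {entropy_bits(size):5.1f} bits")

patterned = SEARCH_SPACES["4-type forced 8-char (patterned)"]
ratio = SEARCH_SPACES["lowercase-only random 12-char"] / patterned
print(f"lowercase 12-char vs patterned: {ratio:.2e} times larger")
```

The dictionary size is the softest assumption here. Doubling it to 40,000 words adds exactly one bit, which does not change the picture.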
GPU Crack Times
Here is how the entropy differences play out in real attacks, using Hashcat benchmark figures. A single RTX 4090 achieves roughly 164 billion hashes/sec for MD5 and about 5,750 hashes/sec for bcrypt (cost=10). Algorithm choice alone changes speed by 7 orders of magnitude.
Full search times for each password type at these speeds:
| Password type | Entropy | MD5 | bcrypt (cost=10) |
|---|---|---|---|
| 4-type forced patterned 8-char | ~26 bits | 0.0004 sec | ~3 hours |
| Lowercase-only random 8-char | ~37.6 bits | ~1.3 sec | ~1.2 years |
| Lowercase-only random 12-char | ~56.4 bits | ~6.7 days | ~520,000 years |
| 95-type random 8-char | ~52.6 bits | ~11 hours | ~37,000 years |
Take a user told to “include uppercase, numbers, and symbols” who creates Password1!. If the password follows the pattern above, an exhaustive search of that pattern space finishes in about 3 hours even with bcrypt.
A random lowercase 12-character string like vkrmxjqwpbtf takes nearly a week even stored as MD5.
What matters isn’t the number of character types but whether the password follows a pattern attackers can predict. The constraint meant to widen the space ends up making the attacker’s job easier.
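The crack-time table is just search space divided by hash rate. A minimal sketch, using the two hash rates quoted above and assuming a single GPU with no parallelism or smarter candidate ordering:

```python
MD5_RATE = 164e9     # hashes/sec, the single-GPU figure quoted above
BCRYPT_RATE = 5_750  # hashes/sec at cost=10, also quoted above


def full_search_seconds(combinations: int, hashes_per_sec: float) -> float:
    """Time to try every candidate once; the average hit takes about half this."""
    return combinations / hashes_per_sec


def humanize(seconds: float) -> str:
    """Format a duration using the largest unit that fits."""
    for unit, size in (("years", 365.25 * 86400), ("days", 86400), ("hours", 3600)):
        if seconds >= size:
            return f"{seconds / size:,.1f} {unit}"
    return f"{seconds:.4f} sec"


PASSWORD_TYPES = {
    "patterned 8-char": 20_000 * 100 * 33,
    "lowercase random 8-char": 26 ** 8,
    "lowercase random 12-char": 26 ** 12,
    "95-type random 8-char": 95 ** 8,
}

for name, size in PASSWORD_TYPES.items():
    md5 = humanize(full_search_seconds(size, MD5_RATE))
    bcrypt = humanize(full_search_seconds(size, BCRYPT_RATE))
    print(f"{name:26s} MD5: {md5:>16s}   bcrypt: {bcrypt}")
```

These are full-search times, so they are upper bounds for a single GPU. An attacker with a cluster divides them by the number of cards.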
How I’d Spec This
If it were up to me, I’d scope the password specs like this.
| Spec | Decision |
|---|---|
| User-initiated voluntary change | Include |
| Forced change on breach / suspicious login | Include |
| Password expiration | Don’t include by default |
| Admin bulk forced change | Include for incident response |
| Session invalidation on change | Optional, or forced for high-risk |
| Blocking breached passwords | Include |
| Character-class enforcement | Don’t include. Control via minimum length |
| MFA | Include wherever possible |
The point isn’t to zero out “the ability to change passwords,” but to stop “change passwords because the clock says so.” Making this distinction makes the conversation much easier.
If regulations or contracts explicitly mandate periodic rotation, you comply within that scope. But if the spec decision is yours, the probability says to invest in detection, notification, MFA, and breached-password blocking rather than periodic rotation.