<pre class="metadata">
Title: Navigational-Tracking Mitigations
Shortname: nav-tracking-mitigations
Repository: privacycg/nav-tracking-mitigations
URL: https://privacycg.github.io/nav-tracking-mitigations/
Editor: Pete Snyder, w3cid 109401, Brave https://brave.com/, [email protected]
Editor: Jeffrey Yasskin, w3cid 72192, Google https://google.com/, [email protected]
Abstract: This specification defines navigational tracking and when and how browsers are required to prevent it from happening.
Status Text: This specification is intended to be migrated to the W3C standards track. It is not a W3C standard.
Text Macro: LICENSE <a href=http://www.w3.org/Consortium/Legal/2015/copyright-software-and-document>W3C Software and Document License</a>
Group: privacycg
Status: CG-DRAFT
Level: None
Complain About: accidental-2119 yes, missing-example-ids yes
Markup Shorthands: markdown yes, css no
Assume Explicit For: yes
Metadata Order: !*, *, This version
!Participate: <a href="https://github.com/[REPOSITORY]">GitHub Repository</a>
!Participate: <a href="https://github.com/privacycg/meetings/">Privacy CG Meetings</a>
</pre>
<pre class="biblio">
{
"FSN-2021-Q4": {
"title": "Firefox Security Newsletter 2021 Q4",
"href": "https://wiki.mozilla.org/Firefox_Security_Newsletter/FSN-2021-Q4",
"date": "2022-03-09"
},
"MOZILLA-TRACKING-POLICY": {
"title": "Mozilla Anti Tracking Policy",
"href": "https://wiki.mozilla.org/Security/Anti_tracking_policy"
},
"WEBKIT-TRACKING-PREVENTION": {
"href": "https://webkit.org/tracking-prevention/",
"title": "Tracking Prevention in WebKit"
}
}
</pre>
<pre class="anchors">
spec: HTTP; urlPrefix: https://httpwg.org/specs/rfc7231.html#
type: dfn; text: HTTP 3xx statuses; url: status.3xx
spec: tracking-dnt; urlPrefix: https://www.w3.org/TR/tracking-dnt/#
type: dfn; text: tracking; url: dfn-tracking
spec: RFC6265; urlPrefix: https://tools.ietf.org/html/rfc6265/
type: dfn
text: cookie store; url: section-5.3
text: domain-match; url: section-5.1.3
spec: RFC7234; urlPrefix: https://tools.ietf.org/html/rfc7234/
type: dfn
text: network cache; url: section-2
</pre>
<section class="non-normative">
<h2 id="intro">Introduction</h2>
<em>This section is non-normative.</em>
Browsers are working to prevent cross-site [=tracking=], which threatens user
privacy. In addition to third-party cookies and storage, other client-side
methods exist that enable cross-site tracking. [=Navigational tracking=]
correlates user identities across sites during navigations between those sites.
[=Navigational tracking=] uses [=link decoration=] to convey information, but
not all [=link decoration=] is tracking. This project attempts to distinguish
tracking from non-tracking navigation and to prevent the tracking without
damaging similar but benign navigations.
</section>
<h2 id="infra">Infrastructure</h2>
This specification depends on the Infra standard. [[!INFRA]]
<h2 id="terminology">Terminology</h2>
<dfn>Link decoration</dfn> is when the source of a [=hyperlink=] "decorates" its [=URL=]
with extra information beyond what's necessary to identify the page a user wants
to navigate to. This information can be placed almost anywhere inside the URL.
<dfn>Navigational tracking</dfn> refers to the general use of one or more
[[HTML#navigating-across-documents|navigations]] to identify that a user on one
site is the same person as a user on another site. Navigations transmit information
cross-site in a few different ways, including in the target URL, which might be
[=link decoration|decorated=], and in the timing of the request.
<div class="example" id="example-link-decoration-tracking">
<style>
#example-link-decoration-tracking code em {background-color: cyan}
</style>
Examples and non-examples of [=link decoration=] and [=navigational tracking=],
with the potential decoration or tracking element emphasized:
: <code>https://publisher.example/page?userId=<em>5789rhkdsaf8urfnsd</em></code>
:: [=Link decoration=], and also [=navigational tracking=].
: <code>https://bookshop.org/a/<em>1122</em>/9780062252074</code>
:: [=Link decoration=] but **not** [=navigational tracking=]: This number
identifies an affiliate to credit with a book sale. Replacing this with
another number gets to the same target page.
: <code>https://bookshop.org/a/1122/<em>9780062252074</em></code>
:: **Not** decoration: This number identifies a particular book. Changing it
yields a different target page.
: <code>https://bugzilla.mozilla.org/show_bug.cgi?id=<em>1460058</em></code>
:: **Not** decoration: changing the number changes which bug the user sees.
: <code>https://www.google.com/maps/@<em>37.4220328,-122.0847584,17.12z</em></code>
:: Issue(4): Changing the numbers changes what map the user sees, and embedding
a user ID would not successfully transfer that user ID to the target site,
but it's hard for an automated system inside a browser to prove
that, and even hard for humans reading the URL to be confident of it.
: <code>https://publisher.example/unsubscribe?userId=<em>5789rhkdsaf8urfnsd</em></code>
:: Issue(5): The URL identifies an action rather than a page, and the user ID
might be essential for that action to happen. However, this is also clearly
a user ID and sufficient to track a user if the source and target
collaborate.
: <code>https://example.com/auth/callback?token=<em>1234567</em></code>
:: Issue(5): This is probably the same case as the unsubscribe link.
: <code>https://example.com/login?returnto=<em>item/12345</em></code>
:: Assuming a request for this URL shows a login page instead of immediately
redirecting to `item/12345`, this is a [=link decoration=] but not
[=navigational tracking=].
</div>
<dfn>Bounce tracking</dfn> refers to the use of redirects in a top-level context
(including [=HTTP 3xx statuses=], <{meta}> elements with
<{meta/http-equiv}>=<{meta/http-equiv/refresh}> attributes, and script-directed
navigation that doesn't wait for user input) along with [=link decoration=] to
join user identities between sites. [=Bounce tracking=] is a subset of
[=navigational tracking=] and can include automated navigation through sites
that are the same as, or different from, the source or ultimate destination of a link.
<div class="example" id="example-bounce-tracking-to-self">
Tracking via a bounce through an aggregation domain:
1. The content publisher's page (on `publisher.example`) embeds a third-party
script from `tracker.example`.
1. The third-party script tries to read an already-stored identifier, for
example one it has set into `publisher.example`'s storage or one read from a
third-party `tracker.example` <{iframe}>.
1. If it can't, it redirects the top level page to `tracker.example` using
{{Window/location|window.location}}.
1. During this load `tracker.example` is the first party and can read and write
its cookie jar.
1. `tracker.example` redirects back to the original page URL, with that URL
[=link decoration|decorated=] with its user ID in a query parameter.
1. The `tracker.example` user ID is now available on `publisher.example` and can
be saved into its first-party storage so that future visits don't need to
bounce.
</div>
<h2 id="threat-model">Threat model</h2>
This section will precisely define the goals and non-goals of this
specification's mitigations. It will define a few classes of actors with the
ability to modify websites in particular ways. Then it will define what
cross-site information each of these actors can or cannot learn.
<h3 id="threat-actors">Threat actors</h3>
TODO
<section class="informative">
<h2 id="alternatives">Considered Alternatives</h2>
<em>This section is non-normative.</em>
So far, the alternative designs consist of mitigations that various browsers
have already deployed.
<h3 id="deployed-mitigations">Deployed Mitigations</h3>
Some browsers have deployed and announced protections against
[=navigational tracking=]. This section is a work in progress to detail what
protections have shipped or are planned. This section is not
comprehensive.
<h4 id="mitigations-safari">Safari</h4>
Safari uses an algorithmic approach to combat [=navigational tracking=]. Safari
classifies a site as having cross-site tracking capabilities if the
[[WEBKIT-TRACKING-PREVENTION#classification-as-having-cross-site-tracking-capabilities|following
criteria]] are met within a particular client:
* The site appears as a third-party resource under enough different
[=host/registrable domains=].
* The site automatically redirects the user to enough other sites, immediately
or after a short delay.
* The site redirects to sites that are classified as trackers, recursively.
<div class="example" id="example-safari-recursive-trackers">
For example, consider the case of a user clicking on a link on
`start.example`, which redirects to `second.example`, which redirects to
`third.example`, which redirects to `end.example`. If Safari has classified
`third.example` as having tracking capabilities, the above behavior can
result in Safari classifying `second.example` as having cross-site tracking
capabilities.
</div>
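<div class="example" id="example-safari-recursive-classification-sketch">
The recursive criterion can be sketched as a backwards propagation over observed
automatic redirects. This is a minimal illustration of the third criterion only.
The function name, the input shape, and the absence of thresholds are assumptions
made for this sketch and do not describe WebKit's implementation.

```python
from collections import defaultdict


def classify_redirectors(redirects, known_trackers):
    """Return known_trackers plus every site that redirects, directly or
    transitively, to one of them.

    redirects: iterable of (source_site, target_site) automatic redirects.
    known_trackers: sites already classified by some other criterion.
    """
    # Index the redirect edges by target so we can walk them backwards.
    redirectors_of = defaultdict(set)
    for source, target in redirects:
        redirectors_of[target].add(source)

    classified = set(known_trackers)
    worklist = list(known_trackers)
    while worklist:
        site = worklist.pop()
        for source in redirectors_of[site]:
            if source not in classified:
                classified.add(source)
                worklist.append(source)
    return classified


# The click on start.example is user-initiated, so only the two automatic
# redirects that follow it are recorded here.
observed = [
    ("second.example", "third.example"),
    ("third.example", "end.example"),
]
result = classify_redirectors(observed, {"third.example"})
```

In this sketch `end.example` stays unclassified because it is only a redirect
target, never a redirect source leading to a classified site.
</div>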
If a user navigates or is redirected from a classified tracker with a URL that
includes either query parameters or a URL fragment, the lifetime of client-side
set cookies on the *destination* page is capped at
[[WEBKIT-TRACKING-PREVENTION#detection-of-cross-site-tracking-via-link-decoration|24
hours]].
<h4 id="mitigations-firefox">Firefox</h4>
Firefox uses a list-based approach to combat [=navigational tracking=]. Sites on the
Disconnect list are considered tracking sites. All storage
for tracking sites is cleared after 24 hours, unless the user has interacted
with the site in the first-party context in the last 45 days.
Firefox is also starting to remove query parameters known to be used for
cross-site tracking. ([[FSN-2021-Q4]]) The affected query parameters are chosen
using the criteria on the [[MOZILLA-TRACKING-POLICY inline]], which includes:
* High-entropy parameters that might identify a user or encode user data,
except:
* Parameters exclusively identifying specific elements or actions on the
navigating page (per-click or per-element identifiers), as long as those
parameters assign a different value to each click or element they are
identifying.
* Identifiers necessary to complete a user-initiated task such as logging in
or submitting a form.
* High-entropy parameters that are broadly included in nearly all outgoing
navigations from a site, even if the parameters don't uniquely identify a
user.
As of May 2022, this query-parameter stripping is applied by default in the
Firefox Nightly build, and is planned to be enabled in strict <abbr
title="Enhanced Tracking Protection">ETP</abbr> mode and in private browsing.
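<div class="example" id="example-query-parameter-stripping-sketch">
Query-parameter stripping can be sketched with a block list applied to the
navigation URL. The parameter names in `TRACKING_PARAMS` are hypothetical and
are not taken from Firefox's list; the function name is also an assumption.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical parameter names, for illustration only.
TRACKING_PARAMS = {"userId", "click_tracker_id"}


def strip_tracking_params(url, blocked=TRACKING_PARAMS):
    """Return url with every query parameter named in blocked removed."""
    parts = urlsplit(url)
    kept = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name not in blocked
    ]
    # Note: urlencode re-serializes the remaining parameters, so their
    # percent-encoding can differ from the original string.
    return urlunsplit(parts._replace(query=urlencode(kept)))


stripped = strip_tracking_params(
    "https://publisher.example/page?userId=5789rhkdsaf8urfnsd&article=42"
)
```

A production implementation would likely preserve the original encoding of the
untouched parameters; this sketch trades that for brevity.
</div>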
<h4 id="mitigations-brave">Brave</h4>
Brave uses four list-based approaches to combat [=navigational tracking=].
First, Brave strips query parameters commonly used for [=navigational tracking=]
from URLs on navigation. This list is maintained by Brave.
Second, by default, when i) the user is about to visit a list-identified
bounce-tracking URL, and ii) the current profile does not contain any cookies
or {{WindowLocalStorage/localStorage}} for that site, Brave will create a new, "ephemeral", empty storage
area for the site. This storage area persists as long as the user has
any top-level frames open for the site. As soon as the user has no
top-level frames for the labeled bounce-tracking site, the ephemeral storage
area is deleted.
Third, in the non-default, "aggressive blocking" configuration, Brave uses
popular crowd-sourced filter lists (e.g., EasyList, EasyPrivacy, uBlock Origin)
to identify URLs that are used for bounce tracking, and will preempt the
navigation with an interstitial (similar to Google Safe Browsing), giving
the user the option to continue the navigation or cancel it.
Fourth, Brave uses a list-based approach for identifying bounce tracking
URLs where the destination URL is present in the URL of the intermediate
tracking URL. In such cases, Brave will skip the intermediate navigation
and request the destination URL instead. For example, if Brave
observes the user about to navigate to the URL
`https://tracker.example/bounce?dest=https://destination.example/`,
the browser might replace the navigation to `tracker.example/bounce`
with a navigation to `https://destination.example/`. This list
is maintained by Brave, and is drawn from a mix of crowd-sourcing and
existing open-source projects.
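<div class="example" id="example-destination-extraction-sketch">
Skipping an intermediate bounce URL can be sketched as a rule lookup followed by
extraction of the embedded destination. The rule format in `DEBOUNCE_RULES` and
the function name are hypothetical and do not describe Brave's actual list or
matching logic.

```python
from urllib.parse import parse_qs, urlsplit

# Hypothetical rule list: (bounce host, bounce path) -> name of the query
# parameter that carries the destination URL.
DEBOUNCE_RULES = {("tracker.example", "/bounce"): "dest"}


def debounce(url, rules=DEBOUNCE_RULES):
    """Return the embedded destination if url matches a rule, else url."""
    parts = urlsplit(url)
    param = rules.get((parts.hostname, parts.path))
    if param is None:
        return url
    values = parse_qs(parts.query).get(param)
    if not values:
        return url
    destination = values[0]
    # Only follow destinations that are themselves web URLs.
    if urlsplit(destination).scheme not in ("http", "https"):
        return url
    return destination


target = debounce("https://tracker.example/bounce?dest=https://destination.example/")
```

URLs that match no rule, or whose destination parameter is missing or is not an
`http` or `https` URL, are returned unchanged.
</div>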
</section>
<h2 id="bounce-tracking-mitigations">Bounce Tracking Mitigations</h2>
The content of this section will provide a "monkey patch" specification for bounce tracking
mitigations. There is a [Chromium-oriented
explainer](https://github.com/privacycg/nav-tracking-mitigations/blob/main/bounce-tracking-explainer.md)
for this work, but the text in this section is intended for adoption across all browsers. This
section is not complete yet, and as the algorithms are developed, they will be specified here and
presented for review.
This spec is written for the case where the following features are enabled:
* All third-party cookies are blocked.
* All cookies written by embedded third-party sites are partitioned.
* All storage written by embedded third-party sites is partitioned.
<p class=note>
The following is a work-in-progress and does not yet reflect any consensus in
the PrivacyCG.
</p>
<h3 id="bounce-tracking-mitigations-data-model">Data Model</h3>
<h4 id="bounce-tracking-mitigations-data-model-global">Global Data</h4>
The user agent holds a <dfn>user activation map</dfn> which is a [=map=] of
[=site=] [=hosts=] to [=moments=]. The [=moments=] represent the most recent
[=wall clock=] time at which the user activated a top-level document on the
associated [=site=] [=host=].
The user agent holds a <dfn>stateful bounce tracking map</dfn> which is a
[=map=] of [=site=] [=hosts=] to [=moments=]. The [=moments=] represent the
first [=wall clock=] time since the last execution of the
[=bounce tracking timer=] at which a page on the given [=site=] [=host=] performed
an action that could indicate stateful bounce tracking took place. For example,
if the [=bounce tracking timer=] ran at time X and bounces occurred at times X-1,
X+1, and X+2, then the map value would be X+1.
Note: The [=host=] of a [=site=] (a schemeless site) is used as the key of both maps because,
by default, cookies are sent to both `http://` and `https://` pages on the same domain.
Note: Hosts are eagerly removed from the [=stateful bounce tracking map=] when
a user activation occurs. This means that a given host can exist in either the
[=user activation map=] or [=stateful bounce tracking map=], but not both at
the same time. The maps will have non-overlapping sets of keys.
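<div class="example" id="example-global-maps-sketch">
The two global maps and the eager-removal rule can be sketched as follows. Times
are plain numbers for readability. The class and method names are hypothetical,
and `record_stateful_bounce` is an assumed stand-in for the bounce-recording
algorithm: it keeps the first timestamp and skips hosts that already have a
user activation.

```python
class BounceTrackingState:
    """Holds the two global maps, keyed by site host."""

    def __init__(self):
        # host -> most recent user activation time
        self.user_activation_map = {}
        # host -> first possible-bounce time since the last timer run
        self.stateful_bounce_tracking_map = {}

    def record_user_activation(self, host, now):
        # Eagerly drop the host from the bounce map, then store the
        # latest activation time.
        self.stateful_bounce_tracking_map.pop(host, None)
        self.user_activation_map[host] = now

    def record_stateful_bounce(self, host, now):
        # Assumption: activated hosts are skipped, and an existing entry
        # is kept so the map holds the first bounce time.
        if host in self.user_activation_map:
            return
        self.stateful_bounce_tracking_map.setdefault(host, now)


state = BounceTrackingState()
state.record_stateful_bounce("tracker.example", 101)
state.record_stateful_bounce("tracker.example", 102)  # does not overwrite 101
state.record_stateful_bounce("other.example", 103)
state.record_user_activation("other.example", 104)    # moves host between maps
```

After these calls the key sets of the two maps do not overlap.
</div>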
<h4 id="bounce-tracking-mitigations-data-model-per-tab">Per-Tab Data</h4>
An <dfn>extended navigation</dfn> is a contiguous sequence of navigations within a single [=top-level traversable=],
joined by client-side redirects, which the user agent expects its user to perceive as a single operation.
Each [=top-level traversable=] has an associated
<dfn for="top-level traversable">bounce tracking record</dfn> which is a
[=bounce tracking record=] or null. This stores the data relevant
to bounce tracking enforcement for every [=extended navigation=].
It is non-null during an [=extended navigation=] and returns to null when the UA stops waiting for further
client-side redirects that might extend the current [=extended navigation=].
A <dfn>bounce tracking record</dfn> is a [=struct=] whose items are:
<dl dfn-for="bounce tracking record">
<dt><dfn>initial host</dfn></dt>
<dd>A [=site=]'s [=host=]. The initiator site of the current [=extended navigation=].</dd>
<dt><dfn>final host</dfn></dt>
<dd>A [=site=]'s [=host=] or null. The destination of the current [=extended navigation=]. Updated after every document load.</dd>
<dt><dfn>bounce set</dfn></dt>
<dd>A [=set=] of [=sites=]' [=hosts=]. All server-side and client-side redirects hit during this [=extended navigation=].</dd>
<dt><dfn>storage access set</dfn></dt>
<dd>A [=set=] of [=sites=]' [=hosts=]. All sites which accessed storage during this [=extended navigation=].</dd>
</dl>
<h4 id="bounce-tracking-mitigations-data-model-constants">Constants</h4>
The <dfn>bounce tracking grace period</dfn> is an [=implementation-defined=]
[=duration=] that represents the length of time after a possible bounce tracking
event during which the user agent will wait for an interaction before deleting a
[=site=] [=host=]'s storage.
Note: 1 hour is a reasonable [=bounce tracking grace period=] value.
The <dfn>bounce tracking activation lifetime</dfn> is an
[=implementation-defined=] [=duration=] that represents how long user
activations will protect a [=site=] [=host=] from storage deletion.
Note: 45 days is a reasonable [=bounce tracking activation lifetime=] value.
The <dfn>bounce tracking timer period</dfn> is an [=implementation-defined=]
[=duration=] that represents how often to run the [=bounce tracking timer=]
algorithm.
Note: 1 hour is a reasonable [=bounce tracking timer period=] value.
The <dfn>client bounce detection timer period</dfn> is an [=implementation-defined=]
[=duration=] that represents how long to wait for a client redirect after a navigation ends.
The purpose is to catch all automated page-triggered redirects, which should be appended to
the current [=extended navigation=], with high probability.
Note: 10 seconds is a reasonable [=client bounce detection timer period=] value.
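<div class="example" id="example-constants-sketch">
The four durations can be written down using the values suggested in the notes
above. The two helper functions are hypothetical and only illustrate how the
grace period and the activation lifetime are compared against timestamps; the
choice of strict versus non-strict comparison at the boundary is an assumption.

```python
from datetime import datetime, timedelta, timezone

BOUNCE_TRACKING_GRACE_PERIOD = timedelta(hours=1)
BOUNCE_TRACKING_ACTIVATION_LIFETIME = timedelta(days=45)
BOUNCE_TRACKING_TIMER_PERIOD = timedelta(hours=1)
CLIENT_BOUNCE_DETECTION_TIMER_PERIOD = timedelta(seconds=10)


def grace_period_elapsed(bounced_at, now):
    """True once the user agent has waited long enough for an interaction."""
    return now - bounced_at >= BOUNCE_TRACKING_GRACE_PERIOD


def activation_still_protects(activated_at, now):
    """True while a past user activation still shields the host's storage."""
    return now - activated_at < BOUNCE_TRACKING_ACTIVATION_LIFETIME


now = datetime(2022, 5, 1, 12, 0, tzinfo=timezone.utc)
recent_bounce = now - timedelta(minutes=5)
old_bounce = now - timedelta(hours=2)
```

With these values, `old_bounce` is past its grace period while `recent_bounce`
is not.
</div>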
<h3 id="bounce-tracking-mitigations-algorithms">Algorithms</h3>
<h4 id="bounce-tracking-mitigations-activation-monkey-patch">User Activation
Monkey Patch</h4>
<div algorithm>
To <dfn>record a user activation</dfn> given a [=Document=] |document|, perform
the following steps:
1. Let |navigable| be |document|'s [=node navigable=].
1. If |navigable| is null, then abort these steps.
1. Let |topDocument| be |navigable|'s [=top-level traversable=]'s
[=navigable/active document=].
1. Let |origin| be |topDocument|'s [=Document/origin=].
1. If |origin| is an [=opaque origin=] then abort these steps.
1. Let |site| be the result of running [=obtain a site=] given |origin|.
1. Let |host| be |site|'s [=host=].
1. [=map/Remove=] |host| from the [=stateful bounce tracking map=].
1. Set [=user activation map=][|host|] to |topDocument|'s
[=relevant settings object=]'s
[=environment settings object/current wall time=].
</div>
Append the following steps to the <a spec="html">activation notification</a>
steps in the [[HTML#user-activation-processing-model|user activation processing
model]]:
1. Run [=record a user activation=] given <var ignore>document</var>.
<h4 id="bounce-tracking-mitigations-web-authentication-monkey-patch">Web Authentication Monkey Patch</h4>
A successful credential access, using the Web Authentication API, is also treated as a user activation for bounce-tracking purposes.
Add the following argument to {{PublicKeyCredential}}'s {{PublicKeyCredential/[[DiscoverFromExternalSource]]()}} algorithm:
<dl dfn-for="PublicKeyCredential/[[DiscoverFromExternalSource]](origin, options, sameOriginWithAncestors)">
<dt>browsingContext</dt>
<dd>The caller's [=environment's=] [=environment/target browsing context=]</dd>
</dl>
Insert the following steps in the {{PublicKeyCredential/[[DiscoverFromExternalSource]]()}} algorithm
in step 17, under "If any authenticator indicates success":
1. Run [=process a Web Authentication assertion for bounce tracking mitigations=] given <var ignore>browsingContext</var>.
<div algorithm>
To <dfn>process a Web Authentication assertion for bounce tracking mitigations</dfn> given
a [=browsing context=] |browsingContext|, perform the following steps:
1. If |browsingContext| is null, then abort these steps.
1. Let |topDocument| be |browsingContext|'s active document.
1. Let |origin| be |topDocument|'s [=Document/origin=].
1. If |origin| is an [=opaque origin=] then abort these steps.
1. Let |site| be the result of running [=obtain a site=] given |origin|.
1. Let |host| be |site|'s [=host=].
1. Set [=user activation map=][|host|] to |topDocument|'s
[=relevant settings object=]'s
[=environment settings object/current wall time=].
</div>
Note: It's also reasonable to treat sign-in to a browser-integrated account as a user activation for
bounce-tracking purposes. Such an interaction could be stored in the [=user activation map=]
for the host of the identity provider and/or the host domain of the account.
<h4 id="bounce-tracking-mitigation-stateful-bounce-detection">Stateful Bounce
Detection</h4>
<h5 id="bounce-tracking-mitigations-nav-start-monkey-patch">Start Navigation
Monkey Patch</h5>
At the start of a navigation, either initialize a new [=bounce tracking record=],
or append a client-side redirect to the current [=bounce tracking record=].
Insert the following steps in the <a spec="html">navigate</a> algorithm before step 8,
"Navigate to a fragment given navigable, url, historyHandling, and navigationId."
1. Run [=process navigation start for bounce tracking=] given
<var ignore>navigable</var>, <var ignore>sourceDocument</var>, and <var ignore>sourceSnapshotParams</var>.
<div algorithm>
To <dfn>process navigation start for bounce tracking</dfn> given a [=navigable=]
|navigable|, [=Document=] |sourceDocument|, and <a spec="html">source snapshot params</a> |sourceSnapshotParams|, perform the following steps:
1. If |navigable| is not a [=top-level traversable=], then abort these steps.
1. Remove any queued global tasks to [=record stateful bounces for bounce tracking=] from the [=networking task source=].
1. Let |origin| be |sourceDocument|'s [=Document/origin=].
1. If |origin| is an [=opaque origin=], set |initialHost| to [=empty host=].
1. Otherwise,
1. Let |site| be the result of running [=obtain a site=] given |origin|.
1. Set |initialHost| to |site|'s [=host=].
1. If |navigable|'s [=top-level traversable/bounce tracking record=] is null:
1. Set |navigable|'s [=top-level traversable/bounce tracking record=] to a new
[=bounce tracking record=] with [=bounce tracking record/initial host=]
set to |initialHost|.
Note: This includes the case where the current navigation was initiated by another navigable, e.g. when opening
a link in a new tab. In this case, |sourceDocument| is set to the opener Document, and the new bounce tracking
record has its [=bounce tracking record/initial host=] set to the opener host. This ensures that trackers opened
in new tabs are detected as distinct from the [=bounce tracking record/initial host=] in the new [=bounce tracking record=].
1. Otherwise,
1. If |sourceSnapshotParams|'s <a spec="html">has transient activation</a> is true:
1. Run [=record stateful bounces for bounce tracking=] given |navigable|'s [=navigable/active document=]'s [=relevant global object=].
1. Set |navigable|'s [=top-level traversable/bounce tracking record=] to a new [=bounce tracking record=] with
[=bounce tracking record/initial host=] set to |initialHost|.
1. Otherwise, add |initialHost| to |navigable|'s [=top-level traversable/bounce tracking record=]'s [=bounce tracking record/bounce set=].
</div>
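<div class="example" id="example-navigation-start-sketch">
The three outcomes of the algorithm above (new record, flush and restart, or
append a client-side redirect) can be sketched as follows. `Tab` stands in for a
[=top-level traversable=], `source_host` is `None` for an opaque origin, and
`flush` is a hypothetical callback standing in for
[=record stateful bounces for bounce tracking=]. The top-level check and the
removal of queued tasks are omitted.

```python
from dataclasses import dataclass, field
from typing import Optional, Set

EMPTY_HOST = ""


@dataclass
class BounceTrackingRecord:
    initial_host: str
    final_host: Optional[str] = None
    bounce_set: Set[str] = field(default_factory=set)
    storage_access_set: Set[str] = field(default_factory=set)


@dataclass
class Tab:
    """Stand-in for a top-level traversable."""
    record: Optional[BounceTrackingRecord] = None


def process_navigation_start(tab, source_host, has_transient_activation, flush):
    initial_host = EMPTY_HOST if source_host is None else source_host
    if tab.record is None:
        # No extended navigation in progress: start one.
        tab.record = BounceTrackingRecord(initial_host)
    elif has_transient_activation:
        # User-initiated: close out the old extended navigation, start anew.
        flush(tab.record)
        tab.record = BounceTrackingRecord(initial_host)
    else:
        # Client-side redirect: the redirecting site joins the bounce set.
        tab.record.bounce_set.add(initial_host)


flushed = []
tab = Tab()
process_navigation_start(tab, "publisher.example", True, flushed.append)
process_navigation_start(tab, "tracker.example", False, flushed.append)
first_record = tab.record
process_navigation_start(tab, "destination.example", True, flushed.append)
```

Here the second call models a script-directed redirect away from
`tracker.example`, so that host lands in the bounce set of the record that the
third call then flushes.
</div>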
<h5 id="bounce-tracking-mitigations-network-cookie-write-monkey-patch">Network
Cookie Write Monkey Patch</h5>
Each [=top-level traversable=] maintains a record of which sites it has saved cookies for in the current [=extended navigation=].
Insert the following steps in the <a spec="fetch">HTTP-network fetch</a> algorithm after step 15,
"... run the "set-cookie-string" parsing algorithm (see section 5.2 of [[COOKIES]]) ...":
1. If cookies were stored in the cookie store in the previous step, then
run [=process a fetch storage access for bounce tracking mitigations=]
given <var ignore>request</var>.
Note: If the `Set-Cookie` header includes any cookies that the user agent ignores, for example because it's blocking third-party cookies,
they're not considered "stored in the cookie store" in this algorithm.
<div algorithm>
To <dfn>process a fetch storage access for bounce tracking mitigations</dfn>
given a [=request=] |request|, perform the following steps:
1. Let |origin| be |request|'s [=request/origin=].
1. If |origin| is an [=opaque origin=], then abort these steps.
1. If |request| is a [=subresource request=], then:
1. If |request|'s [=request/client=] is null, or
|request|'s [=request/client=]'s [=environment/target browsing context=]
is null, then abort these steps.
1. Let |topLevelTraversable| be |request|'s [=request/client=]'s
[=environment/target browsing context=]'s
[=browsing context/top-level traversable=].
1. Otherwise,
1. If |request|'s [=request/reserved client=] is null, or
|request|'s [=request/reserved client=]'s [=environment/target browsing context=]
is null, then abort these steps.
1. Let |topLevelTraversable| be |request|'s [=request/reserved client=]'s
[=environment/target browsing context=]'s
[=browsing context/top-level traversable=].
1. If |topLevelTraversable|'s [=top-level traversable/bounce tracking record=]
is null, abort these steps.
1. Let |site| be the result of running [=obtain a site=] given |origin|.
1. [=set/Append=] |site|'s [=host=] to |topLevelTraversable|'s
[=top-level traversable/bounce tracking record=]'s
[=bounce tracking record/storage access set=].
</div>
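<div class="example" id="example-fetch-storage-access-sketch">
The choice between a request's client and its reserved client can be sketched as
follows. All types here are hypothetical simplifications: a `Tab` stands in for
the [=top-level traversable=] reached through the client's target browsing
context, `None` models a null client or browsing context, and an
`origin_host` of `None` models an opaque origin.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass
class Tab:
    """Stand-in for a top-level traversable; None means no active record."""
    storage_access_set: Optional[Set[str]] = None


@dataclass
class Request:
    origin_host: Optional[str]                 # None models an opaque origin
    is_subresource: bool
    client_tab: Optional[Tab] = None           # reached via the request's client
    reserved_client_tab: Optional[Tab] = None  # reached via the reserved client


def process_fetch_storage_access(request):
    if request.origin_host is None:
        return
    # Subresource requests use the client; navigation requests use the
    # reserved client.
    tab = request.client_tab if request.is_subresource else request.reserved_client_tab
    if tab is None or tab.storage_access_set is None:
        return
    tab.storage_access_set.add(request.origin_host)


tab = Tab(storage_access_set=set())
process_fetch_storage_access(Request("tracker.example", False, reserved_client_tab=tab))
process_fetch_storage_access(Request("publisher.example", True, client_tab=tab))
process_fetch_storage_access(Request(None, True, client_tab=tab))  # opaque: ignored
```

Requests with no reachable tab, or arriving while no bounce tracking record is
active, leave the set untouched.
</div>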
Note: We currently don't treat cookie reads as stateful, but this would be a
reasonable future change. We could run [=process a fetch storage access for bounce tracking mitigations=]
in the <a spec="fetch">HTTP-network-or-cache fetch</a> algorithm after step 8.21.1.2,
"... [=append=] (`Cookie`, cookies) to httpRequest’s [=header list=]. ..."
<h5 id="bounce-tracking-mitigations-service-worker-activation-monkey-patch">Service Worker Activation Monkey Patch</h5>
Each [=top-level traversable=] maintains a record of which sites have activated service workers in the current [=extended navigation=].
Insert the following steps in the [=Handle Fetch=] algorithm after step 23,
"If the result of running the Run Service Worker algorithm...":
1. Run [=process a fetch storage access for bounce tracking mitigations=] given <var ignore>request</var>.
<h5 id="bounce-tracking-mitigations-storage-access-monkey-patch">Storage Access Monkey Patch</h5>
Each [=top-level traversable=] maintains a record of which sites have accessed storage in the current [=extended navigation=].
Insert the following steps in the <a spec="storage">obtain a storage bottle map</a> algorithm before step 10, "Return <var ignore>proxyMap</var>":
1. Run [=process a general storage access for bounce tracking mitigations=] given <var ignore>environment</var>.
Issue(whatwg/storage#165): This patch has to be run whenever a site accesses non-cookie storage.
<a spec="storage">Obtain a storage bottle map</a> is the intended hook for this, but it does not currently have full coverage across specs that use storage.
So this patch is not comprehensive.</p>
<div algorithm>
To <dfn>process a general storage access for bounce tracking mitigations</dfn>
given an [=environment=] |environment|, perform the following steps:
1. If |environment| is not an [=environment settings object=], then abort these steps.
Note: At time of writing, <a spec="storage">obtain a storage bottle map</a> can only accept an [=environment settings object=] |environment|,
but this will be refactored to support [=service workers=]. Service workers attempt to access storage on every navigation, so that access is not considered
when updating the [=bounce tracking record/storage access set=].
2. Let |origin| be |environment|'s [=environment/top-level origin=].
3. If |origin| is null or an [=opaque origin=], then abort these steps.
4. Let |global| be |environment|'s [=environment settings object/realm execution context=]'s [=global object=].
5. Let |navigables| be an [=set/empty=] [=set=] of [=navigables=].
6. If |global| is a [=Window=] object, [=set/append=] |global|'s [=associated document=]'s [=node navigable=] onto |navigables|.
7. Otherwise, if |global| is a {{WorkerGlobalScope}} object,
1. Let |ownerQueue| be an [=queue/empty=] [=queue=] of [=document=] or {{WorkerGlobalScope}} objects.
1. [=queue/Enqueue=] |global| onto |ownerQueue|.
1. [=iteration/While=] |ownerQueue| is not [=queue/empty=],
1. [=queue/Dequeue=] |owner| from |ownerQueue|.
1. If |owner| is a [=document=] object, [=set/append=] |owner|'s [=node navigable=] onto |navigables|.
1. If |owner| is a {{WorkerGlobalScope}} object, then [=set/For each=] |parent| in |owner|'s [=WorkerGlobalScope/owner set=],
[=queue/enqueue=] |parent| onto |ownerQueue|.
Note: Handling {{WorkerGlobalScope}} covers all storage access from a dedicated worker ({{DedicatedWorkerGlobalScope}}) or a shared worker
({{SharedWorkerGlobalScope}}). This doesn't apply to service workers, which rely on [=process a fetch storage access for bounce tracking mitigations=]
during Fetch events and [=process a general storage access for bounce tracking mitigations=] with a [=Window=] object when a service worker is
accessed using navigator.serviceWorker.getRegistration().
8. [=set/For each=] |navigable| in |navigables|:
1. If |navigable| is not a [=top-level traversable=], then [=iteration/continue=].
1. If |navigable|'s [=top-level traversable/bounce tracking record=] is null, then [=iteration/continue=].
1. Let |site| be the result of running [=obtain a site=] given |origin|.
1. [=set/Append=] |site|'s [=host=] to |navigable|'s
[=top-level traversable/bounce tracking record=]'s
[=bounce tracking record/storage access set=].
</div>
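<div class="example">
The worker branch above amounts to a breadth-first walk from the worker up through owner sets until documents are reached. The following Python sketch is illustrative only: `Document`, `Worker`, and `navigables_for` are hypothetical stand-ins rather than spec or browser APIs, each dequeued worker is assumed to contribute its own owner set, and the `seen` guard against revisiting an owner is an addition that the algorithm does not require.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass(eq=False)
class Document:
    navigable: str  # hypothetical stand-in for the document's node navigable


@dataclass(eq=False)
class Worker:
    owner_set: list = field(default_factory=list)  # documents or other workers


def navigables_for(global_obj):
    """Collect the navigables reachable from a document or from a worker's owners."""
    if isinstance(global_obj, Document):
        return [global_obj.navigable]
    navigables = []
    seen = set()  # defensive addition: avoids visiting a shared owner twice
    queue = deque([global_obj])
    while queue:
        owner = queue.popleft()
        if id(owner) in seen:
            continue
        seen.add(id(owner))
        if isinstance(owner, Document):
            if owner.navigable not in navigables:
                navigables.append(owner.navigable)
        else:
            # A worker: keep walking up through its own owner set.
            queue.extend(owner.owner_set)
    return navigables
```

For a nested worker whose owner is a shared worker used by two tabs, the walk yields both tabs' navigables, so the storage access would be considered for each tab.
</div>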
<h5 id="bounce-tracking-mitigations-response-received-monkey-patch">Response Received Monkey
Patch</h5>
When the response is received at the end of a navigation, fill the [=bounce tracking record/bounce set=].
Insert the following steps in the <a spec="html">create navigation params by fetching</a> algorithm, after Step 19.7,
"Wait until either response is non-null...", which is the point at which the [=response/url list=] becomes available.
1. Run [=process response received for bounce tracking=] given
<var ignore>navigable</var> and <var ignore>response</var>'s <var ignore>URL list</var>.
<div algorithm>
To <dfn>process response received for bounce tracking</dfn> given a [=navigable=]
|navigable| and a [=list=] of [=URLs=] |URLs|, perform the following steps:
1. If |navigable| is not a [=top-level traversable=], then abort these steps.
1. [=Assert=]: |navigable|'s [=top-level traversable/bounce tracking record=] is not null.
1. Let |global| be |navigable|'s [=navigable/active document=]'s [=relevant global object=].
1. [=Run steps after a timeout=] given:
<dl>
<dt><var ignore>global</var></dt>
<dd>|global|</dd>
<dt><var ignore>milliseconds</var></dt>
<dd>[=client bounce detection timer period=]</dd>
<dt><var ignore>completionSteps</var></dt>
<dd>[=queue a global task=] on the [=networking task source=] with |global| to [=record stateful bounces for bounce tracking=] given |global|</dd>
</dl>
1. [=list/For each=] |URL| in |URLs|:
1. Let |site| be the result of running [=obtain a site=] given |URL|.
1. Let |host| be |site|'s [=host=].
1. [=set/Append=] |host| to |navigable|'s [=top-level traversable/bounce tracking record=]'s [=bounce tracking record/bounce set=].
</div>
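<div class="example">
The final loop of the algorithm above reduces every URL in the redirect chain to its site's host and adds that host to the bounce set. A minimal Python sketch follows, under a loud assumption: `site_host` is a hypothetical helper that keeps only the last two labels of the hostname, whereas a real implementation of [=obtain a site=] would consult the Public Suffix List. `record_response` is likewise an illustrative name.

```python
from urllib.parse import urlsplit


def site_host(url):
    """Crude stand-in for the host of a URL's site (assumption: last two labels)."""
    hostname = urlsplit(url).hostname or ""
    return ".".join(hostname.split(".")[-2:])


def record_response(bounce_set, url_list):
    """Add the site host of every URL in the redirect chain to bounce_set."""
    for url in url_list:
        bounce_set.add(site_host(url))
```

Because the bounce set is a set, a chain that passes through the same site several times contributes that host only once.
</div>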
<h5 id="bounce-tracking-mitigations-document-load-monkey-patch">Document Loaded Monkey
Patch</h5>
When the document is loaded at the end of a navigation, update the [=bounce tracking record/final host=].
Note: The [=bounce tracking record/final host=] is updated later than [=process response received for bounce tracking=]
to ensure that it is a valid user destination. It is the value that is
compared against when deciding whether to exempt a [=host=] in the [=record stateful bounces for bounce tracking=] algorithm.
Insert the following steps in the <a spec="html">load a document</a> algorithm,
before Step 5, "Return null", which is the point at which the [=document=] is loaded.
1. Run [=process document load for bounce tracking=] given
<var ignore>navigable</var> and <var ignore>response</var>'s <var ignore>URL list</var>.
<div algorithm>
To <dfn>process document load for bounce tracking</dfn> given a [=navigable=]
|navigable| and a [=list=] of [=URLs=] |URLs|, perform the following steps:
1. If |navigable| is not a [=top-level traversable=], then abort these steps.
1. [=Assert=]: |navigable|'s [=top-level traversable/bounce tracking record=] is not null.
1. If |URLs| is empty, then abort these steps.
1. Let |finalSite| be the result of running [=obtain a site=] given the last entry in |URLs|.
1. Set |navigable|'s [=top-level traversable/bounce tracking record=]'s [=bounce tracking record/final host=] to the [=host=] of |finalSite|.
</div>
<h5 id="bounce-tracking-mitigations-navigation-state-diagram">Navigation State Diagram</h5>
<img src="diagrams/navigation_state_diagram.png" alt="Navigation state diagram">
<h4 id="bounce-tracking-mitigations-timers">Timers</h4>
Every [=bounce tracking timer period=] the user agent should run the
[=bounce tracking timer=] algorithm given the [=wall clock=]'s
[=wall clock/unsafe current time=].
Note: Running the deletion algorithm on a global timer has the effect of adding fuzz to the
delay between stateful bounce and data deletion. This mitigates adversarial leaks of
interaction and bounce times. (See "Privacy and Security Considerations", below.)
<div algorithm>
To run the <dfn>bounce tracking timer</dfn> algorithm given a [=moment=] on the
[=wall clock=] |now|, perform the following steps:
1. [=map/For each=] |host| -> |activationTime| of [=user activation map=]:
1. [=Assert=] that [=stateful bounce tracking map=] does not
[=map/contain=] |host|.
1. If |activationTime| + [=bounce tracking activation lifetime=] is before
|now|, then [=map/remove=] |host| from [=user activation map=].
1. [=map/For each=] |host| -> |bounceTime| of [=stateful bounce tracking map=]:
1. [=Assert=] that [=user activation map=] does not [=map/contain=] |host|.
1. If |bounceTime| + [=bounce tracking grace period=] is after |now|, then
[=iteration/continue=].
1. If there is a [=top-level traversable=] whose
[=navigable/active document=]'s [=Document/origin=]'s
[=obtain a site|site=]'s [=host=] equals |host|,
then [=iteration/continue=].
1. [=map/Remove=] |host| from [=stateful bounce tracking map=].
1. [=Clear cookies for host=] given |host|.
1. [=Clear non-cookie storage for host=] given |host|.
1. [=Clear cache for host=] given |host|.
<p class=issue>TODO: Consider if we should do anything when the clock is moved
forward or backward.</p>
</div>
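<div class="example">
The timer algorithm above can be sketched in Python as follows. This is an illustration, not an implementation: the two period constants are assumed values chosen for the example, `open_top_level_hosts` is a hypothetical precomputed set standing in for the check against open [=top-level traversables=], and the function returns the hosts to clear instead of performing the three clearing algorithms itself.

```python
from datetime import datetime, timedelta

# Assumed values for illustration only.
ACTIVATION_LIFETIME = timedelta(days=45)
GRACE_PERIOD = timedelta(hours=1)


def run_timer(now, user_activation_map, stateful_bounce_map, open_top_level_hosts):
    """Expire old activations and return the hosts whose state should be cleared."""
    for host, activation_time in list(user_activation_map.items()):
        assert host not in stateful_bounce_map
        if activation_time + ACTIVATION_LIFETIME < now:
            del user_activation_map[host]

    to_clear = []
    for host, bounce_time in list(stateful_bounce_map.items()):
        assert host not in user_activation_map
        if bounce_time + GRACE_PERIOD > now:
            continue  # still inside the grace period
        if host in open_top_level_hosts:
            continue  # the user currently has this site open in a tab
        del stateful_bounce_map[host]
        to_clear.append(host)
    return to_clear
```

Because the check only happens when the timer fires, a host is cleared at the first tick on or after the end of its grace period, so the observed delay varies by up to one timer period.
</div>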
The [=record stateful bounces for bounce tracking=] algorithm is called when the end of an [=extended navigation=] is detected. This happens
if a user-initiated navigation is detected in [=process navigation start for bounce tracking=],
or if the client bounce detection timer expires after [=process response received for bounce tracking=]
without a client redirect being observed.
<div algorithm>
To run the <dfn>record stateful bounces for bounce tracking</dfn> algorithm
given a [=global object=] |global|, perform the following steps:
1. Let |navigable| be |global|'s [=associated document=]'s [=node navigable=].
1. If |navigable| is null, abort these steps.
This ensures that the global has not been detached from the navigable.
1. Let |topDocument| be |navigable|'s [=navigable/active document=].
1. [=Assert=]: |topDocument| is the same as |global|'s [=associated document=].
1. [=Assert=]: |navigable|'s [=top-level traversable/bounce tracking record=] is not null.
1. [=set/For each=] |host| in |navigable|'s [=top-level traversable/bounce tracking record=]'s [=bounce tracking record/bounce set=]:
1. If |host| [=host/equals=] |navigable|'s [=top-level traversable/bounce tracking record=]'s [=bounce tracking record/initial host=], [=iteration/continue=].
1. If |host| [=host/equals=] |navigable|'s [=top-level traversable/bounce tracking record=]'s [=bounce tracking record/final host=], [=iteration/continue=].
1. If [=user activation map=] [=map/contains=] |host|, [=iteration/continue=].
1. If [=stateful bounce tracking map=] [=map/contains=] |host|, [=iteration/continue=]. (Only the first bounce time since the
last execution of the [=bounce tracking timer=] is tracked in the map.)
1. If |navigable|'s [=top-level traversable/bounce tracking record=]'s
[=bounce tracking record/storage access set=] does not [=set/contain=] |host|, [=iteration/continue=].
1. Set [=stateful bounce tracking map=][|host|] to |topDocument|'s
[=relevant settings object=]'s [=environment settings object/current wall time=].
1. Set |navigable|'s [=top-level traversable/bounce tracking record=] to null.
</div>
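<div class="example">
The filtering performed by the loop above can be sketched in Python. The names `BounceRecord` and `record_stateful_bounces` are hypothetical, the maps are plain dictionaries, and `now` is any comparable timestamp supplied by the caller; none of this is a browser API.

```python
from dataclasses import dataclass, field


@dataclass
class BounceRecord:
    initial_host: str
    final_host: str
    bounce_set: set = field(default_factory=set)
    storage_access_set: set = field(default_factory=set)


def record_stateful_bounces(record, user_activation_map, stateful_bounce_map, now):
    """Stamp every qualifying host in the record's bounce set with now."""
    for host in sorted(record.bounce_set):
        if host in (record.initial_host, record.final_host):
            continue  # the start and end of the extended navigation are exempt
        if host in user_activation_map:
            continue  # the user has interacted with this host
        if host in stateful_bounce_map:
            continue  # keep the first recorded bounce time
        if host not in record.storage_access_set:
            continue  # the bounce was stateless
        stateful_bounce_map[host] = now
```

Only a host that was bounced through, accessed storage, and has no recorded user activation ends up in the map. Calling the function again for the same host leaves the earlier timestamp in place.
</div>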
<h4 id="bounce-tracking-mitigations-deletion">Deletion</h4>
<p class=note>The cookie and cache clearing algorithms were largely copied from
the [[clear-site-data|Clear Site Data]]
spec. It would be nice to unify these in the future.</p>
<div algorithm>
To <dfn>clear cookies for host</dfn> given a [=host=] |host|, perform the
following steps:
1. Let |cookieList| be the set of cookies from the [=cookie store=] whose
domain attribute is a [=domain-match=] with |host|.
1. [=list/For each=] |cookie| in |cookieList|:
1. Remove |cookie| from the [=cookie store=].
</div>
Issue: TODO: Verify that using [=domain-match=] catches all cookies for the
[=host/registrable domain=].
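<div class="example">
To make the open question above concrete, here is a Python sketch of cookie clearing built on the domain-matching rule from RFC 6265. It assumes one reading of the algorithm: a cookie is removed when its domain attribute is identical to the host or is a subdomain of it. The list-of-dictionaries cookie store and the function names are hypothetical.

```python
import ipaddress


def is_ip_address(value):
    """Return True if value parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(value)
        return True
    except ValueError:
        return False


def domain_matches(string, domain_string):
    """RFC 6265 domain-match: identical, or a dot-separated suffix of a host name."""
    string = string.lower()
    domain_string = domain_string.lower()
    if string == domain_string:
        return True
    return string.endswith("." + domain_string) and not is_ip_address(string)


def clear_cookies_for_host(cookie_store, host):
    """Remove matching cookies in place and return how many were removed."""
    kept = [c for c in cookie_store if not domain_matches(c["domain"], host)]
    removed = len(cookie_store) - len(kept)
    cookie_store[:] = kept
    return removed
```

Under this reading, clearing `tracker.example` also removes a cookie set for `cdn.tracker.example`, but leaves `nottracker.example` alone because the suffix is not preceded by a dot.
</div>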
<div algorithm>
To <dfn>clear non-cookie storage for host</dfn> given a [=host=] |host|, perform
the following steps:
1. For each <a spec=storage>storage shed</a> |shed| held by the user agent or a
[=traversable navigable=]:
1. [=map/For each=] |storageKey| -> |storageShelf| of |shed|:
1. If |storageKey|'s <a spec=storage for="storage key">origin</a> is an
[=opaque origin=], then [=iteration/continue=].
1. If |storageKey|'s <a spec=storage for="storage key">origin</a>'s
[=origin/host=] does not equal |host|, then [=iteration/continue=].
1. Delete all data stored in |storageShelf|.
1. [=map/Remove=] |storageKey| from |shed|.
</div>
<div algorithm>
To <dfn>clear cache for host</dfn> given a [=host=] |host|, perform the
following steps:
1. Let |cacheList| be the set of entries from the [=network cache=] whose
target URI [=host=] equals |host|.
1. [=list/For each=] |entry| in |cacheList|:
1. Remove |entry| from the [=network cache=].
</div>
<h2 id="privacy-and-security-considerations">Privacy and Security Considerations</h2>
This feature stores information about sites that have a user interaction, some amount of view time, or
[engagement](https://www.chromium.org/developers/design-documents/site-engagement/). This information is
not directly exposed to sites; however, it can be indirectly observed. For example, if `tracker.example`
reports back its oldest existing state to `site1.example`, then `site1.example` could infer that
`tracker.example` has had an interaction. If it does not report any long-lived state, however, then
`site1.example` could infer that the state was wiped.
In addition, there are potential scenarios where the existence of an interaction could be accessed through
existing XS leaks in the platform. Consider a scenario where a target site has an existing endpoint that
causes an automatic redirect that triggers the bounce tracking mitigations. An attacker could use existing
XS leaks to determine if any logged-in state is present on a target site and then look to see if that state
disappears after triggering the bounce.
The information leak from tracker to site does not seem very significant. At a minimum, the proposed
mitigations do not make the situation worse, and without the mitigations there is a greater potential for a
tracker to communicate state to the target site.
The cross-site adversarial information leak is more concerning. Solutions designed for this effort should
take this threat into account and attempt to mitigate it. For example, delaying or fuzzing the timing of
storage wiping could lessen the impact of the leak. Ultimately, though, it may be necessary to weigh the
cost of this 1-bit information leak against the gains in mitigating bounce tracking.
Another threat that should be considered by any solution is the possibility that an attacker will trick a
user into visiting an existing redirect endpoint on a site that they care about and somehow trigger their
storage to be deleted. The current plan, however, already looks for signals that a user cares about a
site, so that data needed for legitimate use cases is not wiped. This threat puts more
weight on getting these signals correct.
Otherwise this effort does not store or expose any new types of information. It does not create any new
cross-origin communication or storage capabilities.
There is some implementation complexity to be aware of when wiping storage. The browser must be careful
not to wipe storage out from under a site actively using it. This could lead to poor interop and broken
web sites.
<h2 id="acknowledgements" class="no-num">Acknowledgements</h2>
Many thanks to the Privacy Community Group for many good discussions about this proposal.