Patch type combinations -- normative vs non-normative guidance #240

Open
skef opened this issue Nov 19, 2024 · 8 comments

@skef
Contributor

skef commented Nov 19, 2024

I said I would add a discussion of patch map layouts to the encoder section and I still plan to do this. But I'm also wondering about the trade-offs between putting some hard limits on those patterns vs leaving options open. For example, these are the patterns I would expect to see based on the needs we've identified:

  • One set of partial or full invalidation patches in one (e.g. IFT) table. (There's really no difference between the two in such cases)
  • One set of partial invalidation patches in one table, one set of no invalidation patches in the other table.
  • One set of full invalidation patches in one table, one set of no invalidation patches in the other table. (Although this one is questionable.)
  • A combination of full and partial invalidation patches in one table, one set of no invalidation patches in another table.

There are many other possible combinations allowed by the specification, raising questions like these:

  1. Can both tables have the same compatibility id, or is that not allowed?
  2. Can there be more than one set of full or partial invalidation patches? Presumably the idea of this would be to allow independent subsets of tables to be updated independently. Maybe there are cases where this would make sense, but given that we only have two tables to put patches in we don't really have strong support for this as things stand -- it's more that you could do it with a bunch of restrictions (like no glyph-keyed patches at the same time).
  3. Can you put no-invalidation patches into the same table as full invalidation patches? Seems like this would work but the use cases aren't strongly motivated.
  4. Can you put no-invalidation patches into the same table as partial invalidation patches? Seems like this wouldn't work, as you'd be changing the compatibility id for the partial invalidation patches such that the no-invalidation patches wouldn't make sense.

We can choose to handle this entirely with a non-normative discussion of what's likely to make sense but I'd rather deal with some of it normatively. Part of the reason for this is that I don't think all of this comes "for free". For example, if you have two sets of full and/or partial invalidation patches, one in each table, there is a question of what patches are "allowed" to update the compatibility ids -- presumably each id would only be updated by patches of that table. (If we wanted to have a true arbitrary two-level system we'd have to change the extension algorithm, because it's just going to be picking among patches of the invalidation classes somewhat arbitrarily based on the intersections of the subset parameters.) We can say "people doing things outside the box just need to think through this stuff themselves", which is sort of what we've been doing, but that sidesteps the question of what we want people to be able to do. Some options might call for changes to the extension algorithm, for example.

@garretrieger
Contributor

  1. Can both tables have the same compatibility id, or is that not allowed?

I ran into this during my client implementation and I've been meaning to make some clarifications in the spec around this. If you end up with an encoding with two tables having the same compat ID it seems like that should just be encoded as one table. So my current thinking is that we should just forbid this case by making it an error if the client encounters it.
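The check proposed above could look roughly like the following. This is a minimal sketch, assuming a hypothetical client that has already parsed each mapping table's compatibility ID into `bytes`; the names `check_distinct_compat_ids` and `DuplicateCompatIdError` are illustrative and do not come from the spec.

```python
from typing import Optional


class DuplicateCompatIdError(ValueError):
    """Raised when both mapping tables carry the same compatibility ID."""


def check_distinct_compat_ids(ift_id: Optional[bytes], iftx_id: Optional[bytes]) -> None:
    """Reject a font whose IFT and IFTX tables share a compatibility ID.

    Assumption for this sketch: None means the table is absent from the font.
    """
    if ift_id is not None and iftx_id is not None and ift_id == iftx_id:
        raise DuplicateCompatIdError(
            "IFT and IFTX tables have the same compatibility ID"
        )
```

A font with only one mapping table, or with two distinct IDs, passes silently; only the duplicate case raises.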

  2. Can there be more than one set of full or partial invalidation patches? Presumably the idea of this would be to allow independent subsets of tables to be updated independently. Maybe there are cases where this would make sense, but given that we only have two tables to put patches in we don't really have strong support for this as things stand -- it's more that you could do it with a bunch of restrictions (like no glyph-keyed patches at the same time).

Yes, this is supported currently. For example it would be valid to have one set of partial inval patches in each of the two tables where those two sets modified disjoint sets of tables in the font.

  3. Can you put no-invalidation patches into the same table as full invalidation patches? Seems like this would work but the use cases aren't strongly motivated.

Yes, format 2 allows you to assign different patch formats per entry. Mixing full inval + no inval in the same table doesn't seem useful, but there are use cases where you would want to mix full and partial inval into the same table. For example, a font that has a patch which extends design space would likely make use of a full invalidation patch when modifying the font's design space, and partial invalidation patches when modifying code point coverage.

  4. Can you put no-invalidation patches into the same table as partial invalidation patches? Seems like this wouldn't work, as you'd be changing the compatibility id for the partial invalidation patches such that the no-invalidation patches wouldn't make sense.

As with 3 this is supported. The expectation would be that if one of the partial invalidation patches was applied it would by necessity need to change the no invalidation entries to point to new URLs that are compatible with the updated compatibility ID. In fact this setup is essentially functionally equivalent to a font that has two mapping tables with a full invalidation patch in one table and no invalidation patches in the other.

We can choose to handle this entirely with a non-normative discussion of what's likely to make sense but I'd rather deal with some of it normatively.

My preference is to handle 1 with a normative change and 2 through 4 with non-normative guidance. The extension algorithm as currently written can successfully process cases 2 through 4 in a reasonable way as long as the encoding is compliant with the normative encoding requirements.

Part of the reason for this is that I don't think all of this comes "for free". For example, if you have two sets of full and/or partial invalidation patches, one in each table, there is a question of what patches are "allowed" to update the compatibility ids -- presumably each id would only be updated by patches of that table.

This is covered by Patch Invalidations: a partial invalidation patch can only change the compatibility ID of the table containing the patch. Full invalidation patches will change both tables.
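As a rough illustration of the rule as stated in this comment, the sketch below maps an invalidation mode and the table containing the patch to the set of mapping tables whose compatibility ID changes. The `Invalidation` enum and the function name are hypothetical, and the behaviour for full invalidation follows the wording here ("will change both tables") rather than any normative text.

```python
from enum import Enum
from typing import Set


class Invalidation(Enum):
    FULL = "full"
    PARTIAL = "partial"
    NONE = "none"


MAPPING_TABLES = ("IFT", "IFTX")


def tables_with_changed_compat_id(mode: Invalidation, containing_table: str) -> Set[str]:
    """Return the mapping tables whose compatibility ID a patch changes."""
    if containing_table not in MAPPING_TABLES:
        raise ValueError(f"unknown mapping table: {containing_table!r}")
    if mode is Invalidation.FULL:
        # Full invalidation: both tables get a new compatibility ID.
        return set(MAPPING_TABLES)
    if mode is Invalidation.PARTIAL:
        # Partial invalidation: only the table that lists the patch.
        return {containing_table}
    # No invalidation: compatibility IDs are left untouched.
    return set()
```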

(If we wanted to have a true arbitrary two-level system we'd have to change the extension algorithm, because it's just going to be picking among patches of the invalidation classes somewhat arbitrarily based on the intersections of the subset parameters.)

That's right, so if you have two sets of partial invalidation patches in two tables then the extension algorithm will pick first from one table or the other (if following the recommended guidance, it would select the patch with the most "stuff you need" overall). In these types of cases the consistency requirement ensures that the end result after extension finishes is the same regardless of the particular order the client chose.
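To make the order-independence property concrete, here is a toy model, not the real patch machinery. It assumes each patch touches exactly one font table and simply adds code points to it; `Font`, `Patch`, `apply_patch`, and `all_orders_agree` are hypothetical names. These toy patches commute by construction, whereas a real encoder has to guarantee the property through how it builds its patches.

```python
from itertools import permutations
from typing import Dict, FrozenSet, Tuple

# Toy font: font table tag -> set of code points it supports.
Font = Dict[str, FrozenSet[int]]
# Toy patch: (font table tag it modifies, code points it adds).
Patch = Tuple[str, FrozenSet[int]]


def apply_patch(font: Font, patch: Patch) -> Font:
    """Return a new font with the patch's code points added to one table."""
    tag, added = patch
    updated = dict(font)
    updated[tag] = updated.get(tag, frozenset()) | added
    return updated


def all_orders_agree(font: Font, patches: Tuple[Patch, ...]) -> bool:
    """Check that every application order produces the same final font."""
    results = []
    for order in permutations(patches):
        current = font
        for patch in order:
            current = apply_patch(current, patch)
        results.append(current)
    return all(result == results[0] for result in results)
```

For two patches that modify disjoint font tables, such as one touching `glyf` and one touching `GSUB`, `all_orders_agree` returns `True`.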

We can say "people doing things outside the box just need to think through this stuff themselves", which is sort of what we've been doing, but that sidesteps the question of what we want people to be able to do. Some options might call for changes to the extension algorithm, for example.

This is the approach that I'd advocate for: in encoder guidance, explain the typical way an encoder should approach encoding (i.e. one set of partial and one set of no invalidation patches in separate tables) and then say that if you're going outside of that framework you need to work through how the extension algorithm will process your encoding and ensure the encoding requirements are met with whatever you chose to do.

@garretrieger
Contributor

One additional normative requirement I think we should add, which is currently only implied:

  • Add an additional encoder requirement that each patch which is part of an encoding must follow the invalidation behaviour in Patch Invalidations

@skef
Contributor Author

skef commented Nov 20, 2024

Yes, this is supported currently.

I should have been clearer in my language: when I've been saying "can" I mean "should it be the case that the spec allows this?"

For example, I know in the case of 4 that the spec currently allows the combination, but I don't think it makes sense.

In fact this setup is essentially functionally equivalent to a font that has two mapping tables with a full invalidation patch in one table and no invalidation patches in the other.

Except that a full invalidation patch may or may not change the compatibility id in the other table (it has the option to change it, and I suppose it should change it if it's claiming to be full invalidation), while the partial invalidation patch must change it by definition (that is, to invalidate its sibling patches, which is what makes it partial invalidation), and that change will apply to the no-invalidation patches as well. So effectively mixing partial and no-invalidation patches in the same table is equivalent to mixing full and no-invalidation patches in the same table, except that the partial invalidation patches aren't really partial-invalidation, they invalidate both.

So, more generally, the spec currently allows a number of combinations that violate our premises -- with patches of one invalidation type de facto becoming patches of a different invalidation type because of how they're used.

@skef
Contributor Author

skef commented Nov 20, 2024

For example it would be valid to have one set of partial inval patches in each of the two tables where those two sets modified disjoint sets of tables in the font.

Right, the question here is "why two"? The accurate answer is because we added a second table so that table keyed patches could be separate from glyph-keyed patches, so this is possible as a side effect. But why not 3? or 4? And if the answer is "there's no reason for 3 or 4", then is there a reason for 2?

@skef
Contributor Author

skef commented Nov 20, 2024

but there are use cases where you would want to mix full and partial inval into the same table

Definitely, see bullet 4 of the issue.

@skef
Contributor Author

skef commented Nov 20, 2024

This is covered by Patch Invalidations: a partial invalidation patch can only change the compatibility ID of the table containing the patch. Full invalidation patches will change both tables.

Even so, I still think we have definitional problems if full and/or partial invalidation patches can be mixed with no-invalidation patches, because any partial invalidation patches are no longer partial invalidation.

@skef
Contributor Author

skef commented Nov 20, 2024

Given the definitions, shouldn't we at least prohibit having full invalidation patches in multiple tables? Does it make sense to have two sets of patches that will necessarily change the compatibility id of the other table? So depending on the order we process the tables in (have we officially put IFT before IFTX -- I don't see it in the algorithm) and what patch "wins" based on subsetting, that patch invalidates the other patches?

@garretrieger
Contributor

garretrieger commented Nov 20, 2024

I should have been clearer in my language: when I've been saying "can" I mean "should it be the case that the spec allows this?"

My position is that yes we should allow these cases. I believe we should only prohibit specific cases when the extension algorithm in combination with normative encoder requirements can’t process them in a well defined manner. Of the cases presented so far only 1) where two tables have the same compat id appears problematic to me.

So, more generally, the spec currently allows a number of combinations that violate our premises -- with patches of one invalidation type de facto becoming patches of a different invalidation type because of how they're used.
Even so, I still think we have definitional problems if full and/or partial invalidation patches can be mixed with no-invalidation patches, because any partial invalidation patches are no longer partial invalidation.

I think there may be a misunderstanding of how invalidation is defined. Some important points about how invalidation is defined:

  • Invalidation categories specify how the application of a patch affects the validity of other patches currently listed in the font.
  • Partial invalidation is defined as invalidating any patches regardless of their own invalidation mode that reside in the same mapping table.
  • When something is no invalidation that means it won’t invalidate other patches, but that does not imply it won’t be invalidated by a partial invalidation patch. Whether or not it’s invalidated by a partial invalidation patch depends solely on which mapping table it resides in.

So when a partial invalidation patch is mixed with no invalidation patches in the same table, and application of the partial invalidation patch invalidates the no invalidation patches in that same table, that’s working exactly as intended. It doesn’t represent the patch becoming a different invalidation category or violate our premises. Ultimately it's up to the encoder to place patches in the appropriate table to get whatever behaviour it desires.

On the client side an optimized implementation needs to consider the implications of the invalidation modes when it’s selecting the set of patches to start loads for. In the above case the client would not start loads for the no invalidation patches because it can see that they will be invalidated by the partial invalidation patch application. However, if there were some additional no invalidation patches in the second table those loads could be started as they are not affected by the partial invalidation patch.
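The invalidation definitions and the load-selection reasoning above can be sketched as follows. This is an illustrative model only: `Entry`, `is_invalidated_by`, and `loads_to_start` are hypothetical names, and the sketch assumes the client has already decided which patch it will apply next.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Entry:
    url: str
    table: str  # mapping table listing the entry: "IFT" or "IFTX"
    mode: str   # invalidation mode: "full", "partial", or "none"


def is_invalidated_by(applied: Entry, other: Entry) -> bool:
    """True if applying `applied` invalidates `other`, whatever `other`'s mode."""
    if applied.mode == "full":
        return True
    if applied.mode == "partial":
        # Depends solely on which mapping table `other` resides in.
        return other.table == applied.table
    return False


def loads_to_start(next_patch: Entry, candidates: List[Entry]) -> List[Entry]:
    """Return `next_patch` plus the candidates that survive its application."""
    survivors = [
        c for c in candidates
        if c != next_patch and not is_invalidated_by(next_patch, c)
    ]
    return [next_patch] + survivors
```

With a partial invalidation patch in `IFT`, a no invalidation entry in `IFT`, and a no invalidation entry in `IFTX`, `loads_to_start` keeps the partial patch and the `IFTX` entry and skips the `IFT` no invalidation entry.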

Right, the question here is "why two"? The accurate answer is because we added a second table so that table keyed patches could be separate from glyph-keyed patches, so this is possible as a side effect. But why not 3? or 4? And if the answer is "there's no reason for 3 or 4", then is there a reason for 2?

Two tables is the minimum number needed to support the types of behaviour we want in order to enable efficient extension. Most importantly, having two tables allows us to have part of the patch mapping be untouched by the application of a binary diff. This is challenging to do if only one mapping table is used. The current approach we’re using generalizes to more than two mapping tables, but I haven’t seen a need for more than two so for now the spec has only two.

Given the definitions, shouldn't we at least prohibit having full invalidation patches in multiple tables? Does it make sense to have two sets of patches that will necessarily change the compatibility id of the other table? So depending on the order we process the tables in (have we officially put IFT before IFTX -- I don't see it in the algorithm) and what patch "wins" based on subsetting, that patch invalidates the other patches?

As long as the encoding meets the encoder requirements this case isn’t problematic. This isn’t much different than the case where both of those full invalidation patches reside in the same table (which is something we’re doing in the prototype encodings). Since full invalidation patches invalidate patches in both tables it’s not actually all that important which of the two tables they’re listed in (except in the cases where there are also partial invalidation patches in the mix). The consistency requirement ensures that it doesn't matter which one wins and is applied first. For reference, the table keyed patch encoder pseudo code in the encoding considerations section demonstrates how to make an encoding out of only full invalidation patches that respects the consistency requirement.

As for the question of selection order between IFT and IFTX: since we union the two patch entry lists together before processing, for the most part there isn’t a preference for one table or the other. Selection will be performed by finding a patch entry which matches the recently added selection criteria. However, this does raise one ambiguity in the tie-breaking step, which currently states to select the entry listed first. If we are tie breaking between patches derived from both tables, then this is not well defined. I’ll need to add some additional text there to consider IFT entries as being before IFTX ones for the tie-breaking step.
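One way the proposed tie-breaking clarification might be expressed, as a sketch only: order tied entries first by table (IFT before IFTX) and then by position within that table. The representation of an entry as a `(table tag, index)` tuple and the names `TABLE_RANK` and `tie_break` are assumptions made for illustration; the additional spec text had not been written at the time of this comment.

```python
from typing import List, Tuple

# Assumed ordering: all IFT entries count as listed before any IFTX entry.
TABLE_RANK = {"IFT": 0, "IFTX": 1}

# An entry is identified by (mapping table tag, index within that table).
EntryRef = Tuple[str, int]


def tie_break(tied: List[EntryRef]) -> EntryRef:
    """Pick the entry listed first among otherwise equally ranked entries."""
    return min(tied, key=lambda entry: (TABLE_RANK[entry[0]], entry[1]))
```

Under this ordering, an IFT entry at index 5 still wins over an IFTX entry at index 0.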

2 participants