Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Alembic always detects changes on indexes that use to_tsvector() #1390

Open
caumons opened this issue Jan 9, 2024 · 6 comments
Open

Alembic always detects changes on indexes that use to_tsvector() #1390

caumons opened this issue Jan 9, 2024 · 6 comments

Comments

@caumons
Copy link

caumons commented Jan 9, 2024

Since I upgraded to Alembic 1.13.1, SQLAlchemy 2.0.25, Flask-SQLAlchemy 3.1.1, Flask 3.0.0 and Python 3.12, everytime I run flask db migrate Alembic detects unexisting changes on indexes that use to_tsvector() for full text search using PostgreSQL 16. It keeps creating migrations to drop and recreate those unchanged indexes again and again.

The Alembic output is:

INFO  [alembic.runtime.migration] Context impl PostgresqlImpl.
INFO  [alembic.runtime.migration] Will assume transactional DDL.
INFO  [alembic.ddl.postgresql] Detected sequence named 'post_reading_id_seq' as owned by integer column 'post_reading(id)', assuming SERIAL and omitting
INFO  [alembic.ddl.postgresql] Detected sequence named 'chapter_comment_reaction_id_seq' as owned by integer column 'chapter_comment_reaction(id)', assuming SERIAL and omitting
INFO  [alembic.ddl.postgresql] Detected sequence named 'page_id_seq' as owned by integer column 'page(id)', assuming SERIAL and omitting
INFO  [alembic.ddl.postgresql] Detected sequence named 'book_reservation_id_seq' as owned by integer column 'book_reservation(id)', assuming SERIAL and omitting
INFO  [alembic.ddl.postgresql] Detected sequence named 'category_search_id_seq' as owned by integer column 'category_search(id)', assuming SERIAL and omitting
INFO  [alembic.ddl.postgresql] Detected sequence named 'chapter_reading_id_seq' as owned by integer column 'chapter_reading(id)', assuming SERIAL and omitting
INFO  [alembic.ddl.postgresql] Detected sequence named 'post_comment_reaction_id_seq' as owned by integer column 'post_comment_reaction(id)', assuming SERIAL and omitting
INFO  [alembic.ddl.postgresql] Detected sequence named 'alert_id_seq' as owned by integer column 'alert(id)', assuming SERIAL and omitting
INFO  [alembic.ddl.postgresql] Detected sequence named 'contact_message_id_seq' as owned by integer column 'contact_message(id)', assuming SERIAL and omitting
INFO  [alembic.ddl.postgresql] Detected sequence named 'chapter_reaction_id_seq' as owned by integer column 'chapter_reaction(id)', assuming SERIAL and omitting
INFO  [alembic.ddl.postgresql] Detected sequence named 'broadcast_message_id_seq' as owned by integer column 'broadcast_message(id)', assuming SERIAL and omitting
INFO  [alembic.ddl.postgresql] Detected sequence named 'snippet_id_seq' as owned by integer column 'snippet(id)', assuming SERIAL and omitting
INFO  [alembic.ddl.postgresql] Detected sequence named 'text_search_id_seq' as owned by integer column 'text_search(id)', assuming SERIAL and omitting
INFO  [alembic.ddl.postgresql] Detected sequence named 'post_reaction_id_seq' as owned by integer column 'post_reaction(id)', assuming SERIAL and omitting
INFO  [alembic.autogenerate.compare] Detected changed index 'ix_category_ts' on 'category': expression #1 "to_tsvector('spanish'::regconfig, (name::text || ' '::text) || hashtag::text)" to "to_tsvector('spanish', name || ' ' || hashtag)"
INFO  [alembic.autogenerate.compare] Detected changed index 'ix_chapter_ts' on 'chapter': expression #1 "to_tsvector('spanish'::regconfig, (((title::text || ' '::text) || teaser::text) || ' '::text) || raw_text)" to "to_tsvector('spanish', chapter.title || ' ' || teaser || ' ' || raw_text)"
INFO  [alembic.autogenerate.compare] Detected changed index 'ix_post_ts' on 'post': expression #1 "to_tsvector('spanish'::regconfig, (((title::text || ' '::text) || teaser::text) || ' '::text) || raw_text)" to "to_tsvector('spanish', post.title || ' ' || teaser || ' ' || raw_text)"
  Generating /app/migrations/versions/673f9d0115e2_.py ...  done

The affected models look as follows (simplified):

class Chapter(BasePublication):
    part_id = db.Column(db.Integer, db.ForeignKey('part.id', ondelete='SET NULL'), index=True)
    classification = db.relationship('Part', backref=db.backref('publications', lazy='dynamic'))

    categories = db.relationship('Category', secondary='chapter_category',
                                 backref=db.backref('chapters', lazy='dynamic'), lazy='dynamic')

    __ts_vector__ = sa.func.to_tsvector(
        sa.literal('spanish'), sa.text("chapter.title || ' ' || teaser || ' ' || raw_text"))

    __table_args__ = (
        db.Index('ix_chapter_ts', __ts_vector__, postgresql_using='gin'),
        db.UniqueConstraint('number', 'fragment', name='unique_chapter_number_fragment'),
    )
class Post(BasePublication):
    topic_id = db.Column(db.Integer, db.ForeignKey('topic.id', ondelete='SET NULL'), index=True)
    classification = db.relationship('Topic', backref=db.backref('publications', lazy='dynamic'))

    categories = db.relationship(
        'Category', secondary='post_category', backref=db.backref('posts', lazy='dynamic'), lazy='dynamic')

    __ts_vector__ = sa.func.to_tsvector(
        sa.literal('spanish'), sa.text("post.title || ' ' || teaser || ' ' || raw_text"))

    __table_args__ = (
        db.Index('ix_post_ts', __ts_vector__, postgresql_using='gin'),
        db.UniqueConstraint('number', 'fragment', name='unique_post_number_fragment'),
    )
class Category(BaseModel):
    name = db.Column(db.String(50), nullable=False, unique=True)
    hashtag = db.Column(db.String(51), nullable=False, unique=True)

    __ts_vector__ = sa.func.to_tsvector(
        sa.literal('spanish'), sa.text("name || ' ' || hashtag"))

    __table_args__ = (
        db.Index('ix_category_ts', __ts_vector__, postgresql_using='gin'),
    )

One of the multiple generated migrations looks like follows:

def upgrade():
    # ### commands auto generated by Alembic - please adjust! ###
    with op.batch_alter_table('category', schema=None) as batch_op:
        batch_op.drop_index('ix_category_ts', postgresql_using='gin')
        batch_op.create_index('ix_category_ts', [sa.text("to_tsvector('spanish', name || ' ' || hashtag)")], unique=False, postgresql_using='gin')

    with op.batch_alter_table('chapter', schema=None) as batch_op:
        batch_op.drop_index('ix_chapter_ts', postgresql_using='gin')
        batch_op.create_index('ix_chapter_ts', [sa.text("to_tsvector('spanish', chapter.title || ' ' || teaser || ' ' || raw_text)")], unique=False, postgresql_using='gin')

    with op.batch_alter_table('post', schema=None) as batch_op:
        batch_op.drop_index('ix_post_ts', postgresql_using='gin')
        batch_op.create_index('ix_post_ts', [sa.text("to_tsvector('spanish', post.title || ' ' || teaser || ' ' || raw_text)")], unique=False, postgresql_using='gin')

    # ### end Alembic commands ###


def downgrade():
    # ### commands auto generated by Alembic - please adjust! ###
    with op.batch_alter_table('post', schema=None) as batch_op:
        batch_op.drop_index('ix_post_ts', postgresql_using='gin')
        batch_op.create_index('ix_post_ts', [sa.text("to_tsvector('spanish'::regconfig, (((title::text || ' '::text) || teaser::text) || ' '::text) || raw_text)")], unique=False, postgresql_using='gin')

    with op.batch_alter_table('chapter', schema=None) as batch_op:
        batch_op.drop_index('ix_chapter_ts', postgresql_using='gin')
        batch_op.create_index('ix_chapter_ts', [sa.text("to_tsvector('spanish'::regconfig, (((title::text || ' '::text) || teaser::text) || ' '::text) || raw_text)")], unique=False, postgresql_using='gin')

    with op.batch_alter_table('category', schema=None) as batch_op:
        batch_op.drop_index('ix_category_ts', postgresql_using='gin')
        batch_op.create_index('ix_category_ts', [sa.text("to_tsvector('spanish'::regconfig, (name::text || ' '::text) || hashtag::text)")], unique=False, postgresql_using='gin')

    # ### end Alembic commands ###

When I was using Alembic 1.7.6, SQLAlchemy 1.4.31, Flask-SQLAlchemy 2.5.1, Flask 2.0.3 and Python 3.10 this didn't happen as there was no support for this kind of indexes at that moment.

I hope this can get resolved soon.

Best regards.

@caumons caumons added the requires triage New issue that requires categorization label Jan 9, 2024
@CaselIT CaselIT added postgresql index expressions bug Something isn't working autogenerate - detection and removed requires triage New issue that requires categorization labels Jan 9, 2024
@CaselIT
Copy link
Member

CaselIT commented Jan 9, 2024

Hi,

Index expression compare was added to postgresql, and this seems to be an unsupported case.
For not you can ignore these indexes using name filter or object filters of alembic

@caumons
Copy link
Author

caumons commented Jan 9, 2024

Hi @CaselIT, thank you very much for your fast response!

I've just added the code of one of the generated migrations so you can see how they look like.

Yes, I'll ignore these 3 indexes while this case is unsupported.

Best regards.

@zzzeek
Copy link
Member

zzzeek commented Jan 9, 2024

@CaselIT should we have some doc notes about this specifically for indexes on PostgreSQL? since this is going to go on for a long time

@CaselIT
Copy link
Member

CaselIT commented Jan 9, 2024

Well we kinda said "next issue we get we place this under a config flag" so we could do that.

I think the regexp we use to remove the type cast used by pg is not detecting it in that complex case, so that should be fixed, but I guess it will not be the last issue we get

@CaselIT CaselIT added this to the 1.13 milestone Jan 10, 2024
@mvanderlee
Copy link

I'm running into this as well, but in my case it seems to related to Alembic not matching PSQL's cast expressions.
CAST(column AS TEXT) == column::text Yet Alembic sees them as different.

This migration is created every single time.

def upgrade() -> None:
    op.drop_index('idx_user_email_search_idx', table_name='user', postgresql_using='gin')
    op.create_index('idx_user_email_search_idx', 'user', [sa.text("to_tsvector(CAST('simple' AS REGCONFIG), CAST(coalesce(email, '') AS TEXT))")], unique=False, postgresql_using='gin')
    

def downgrade() -> None:
    op.drop_index('idx_user_email_search_idx', table_name='user', postgresql_using='gin')
    op.create_index('idx_user_email_search_idx', 'user', [sa.text("to_tsvector('simple'::regconfig, COALESCE(email, ''::character varying)::text)")], unique=False, postgresql_using='gin')
    

@CaselIT
Copy link
Member

CaselIT commented Jun 3, 2024

See also #1485

Shame there does not seem to be the equivalent of "NOT VALID" for indexes, otherwise we could have made pg compare the expression directly.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants