Intermittent Failure in Fetching Emails Using googleapiclient.discovery in Gmail API #2448

Open
shikharvaish28 opened this issue Jul 25, 2024 · 5 comments

shikharvaish28 commented Jul 25, 2024

We are experiencing intermittent failures when attempting to fetch emails using the googleapiclient.discovery library with the Gmail API. Despite the presence of the emails in the inbox that match the query criteria, the API sometimes fails to retrieve them.

  1. This is an intermittent issue, but I have been hitting it frequently of late.
  2. The code runs on 2 different EC2 instances with 2 different email IDs, and both of them hit the issue intermittently.

Steps to reproduce

  1. Use the googleapiclient.discovery library to fetch emails from the Gmail inbox.
  2. Query for unread emails from an email address and subject criteria.
  3. Observe the intermittent failures in fetching the emails.

Version of libraries

Python 3.10.12

google-api-core==2.17.1
google-api-python-client==2.118.0
google-auth==2.28.0
google-auth-httplib2==0.2.0
google-auth-oauthlib==1.2.0
googleapis-common-protos==1.62.0

Code example

Here is the code I have been using for a year; it fails sometimes.

import logging
import os
import re
import time

from google.auth.transport.requests import Request
from google.oauth2.credentials import Credentials
from google_auth_oauthlib.flow import InstalledAppFlow
from googleapiclient.discovery import build

logger = logging.getLogger(__name__)
logger.setLevel(logging.INFO)


def get_gmail_service():
    """
    The file token.json stores the user's access and refresh tokens, and is
    created automatically when the authorization flow completes for the first time.
    """
    scopes = ['https://mail.google.com/']
    creds = None
    if os.path.exists('token.json'):
        creds = Credentials.from_authorized_user_file('token.json', scopes)
    # If there are no (valid) credentials available, let the user log in.
    if not creds or not creds.valid:
        if creds and creds.expired and creds.refresh_token:
            creds.refresh(Request())
        else:
            flow = InstalledAppFlow.from_client_secrets_file(
                'credentials.json', scopes)
            creds = flow.run_local_server(port=0)
        # Save the credentials for the next run
        with open('token.json', 'w') as token:
            token.write(creds.to_json())

    service = build('gmail', 'v1', credentials=creds, cache_discovery=False)
    return service


def delete_email():
    service = get_gmail_service()
    results = service.users().messages().list(
        userId='me',
        labelIds=['INBOX'],
        q="from:edis@cdslindia.co.in subject:Transaction OTP"
    ).execute()
    # The 'messages' key is absent when nothing matches, so check before indexing
    messages = results.get('messages', [])

    if not messages:
        logger.info('[delete_email] No messages from CDSL found to be deleted')
    else:
        logger.info("[delete_email] Found message from CDSL, deleting it...")
        service.users().messages().delete(userId='me', id=messages[0]['id']).execute()
        logger.info("[delete_email] Successfully deleted message from CDSL")


def get_otp_from_email(tries=0) -> str:
    service = get_gmail_service()
    try:
        results = service.users().messages().list(
            userId='me',
            maxResults=1,
            labelIds=['INBOX'],
            q="from:edis@cdslindia.co.in is:unread subject:Transaction OTP"
        ).execute()
        # The 'messages' key is absent when nothing matches, so check before indexing
        messages = results.get('messages', [])

        if not messages:
            logger.warning('[get_otp] No messages from CDSL found.')
        else:
            msg = service.users().messages().get(userId='me', id=messages[0]['id']).execute()
            # The OTP is the first 6-digit run in the message snippet
            match_opt = re.search(r'(\d{6})', str(msg['snippet']))
            if match_opt:
                return match_opt.group()
            logger.warning('[get_otp] Message found, but no 6-digit OTP in its snippet.')
    except Exception as exp:
        logger.warning("[get_otp] Failed to read message from CDSL: %s", exp)

    if tries >= 5:
        logger.error("[get_otp] No email from edis even after %s tries", tries)
        exit()

    # If email is not received, wait for 15 seconds and try again,
    # returning the result of the retry to the caller
    time.sleep(15)
    return get_otp_from_email(tries=tries + 1)
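
For comparison, the wait-and-retry step could also be written as a bounded loop instead of recursion. This is only a minimal sketch, not part of the code that is running in production: `poll_for_otp`, `extract_otp` and the `fetch_snippet` callable are hypothetical names, and `fetch_snippet` is assumed to wrap the `messages().list` / `messages().get` calls and return the message snippet, or `None` when nothing matched.

```python
import re
import time
from typing import Callable, Optional

# Same pattern as the original code: the first run of six digits
OTP_PATTERN = re.compile(r"\d{6}")


def extract_otp(snippet: str) -> Optional[str]:
    """Return the first 6-digit run in the snippet, or None if there is none."""
    match = OTP_PATTERN.search(snippet or "")
    return match.group() if match else None


def poll_for_otp(
    fetch_snippet: Callable[[], Optional[str]],  # hypothetical wrapper around the Gmail calls
    max_tries: int = 6,
    delay_seconds: float = 15.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[str]:
    """Call fetch_snippet up to max_tries times, sleeping between attempts."""
    for attempt in range(1, max_tries + 1):
        otp = extract_otp(fetch_snippet() or "")
        if otp is not None:
            return otp
        # Do not sleep after the final attempt
        if attempt < max_tries:
            sleep(delay_seconds)
    return None
```

Passing `sleep` as a parameter is just a convenience so the loop can be exercised without real 15-second waits; the caller decides what to do when `None` comes back.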

Stack trace

Here is the truncated log from the day it failed:

 
09:09:44.619
2024-07-25 03:39:44,619—DEBUG-googleapiclient.discovery — URL being requested: GET https://gmail.googleapis.com/gmail/v1/users/me/messages?maxResults=1&labelIds=INBOX&q=from%3Aedis%40cdslindia.co.in+is%3Aunread+subject%3ATransaction+OTP&alt=json
 
09:09:45.083
2024-07-25 03:39:45,083—DEBUG-googleapiclient.discovery — URL being requested: GET https://gmail.googleapis.com/gmail/v1/users/me/messages/190e7f906d9a91b0?alt=json
 
09:10:00.423
2024-07-25 03:40:00,423—DEBUG-googleapiclient.discovery — URL being requested: GET https://gmail.googleapis.com/gmail/v1/users/me/messages?maxResults=1&labelIds=INBOX&q=from%3Aedis%40cdslindia.co.in+is%3Aunread+subject%3ATransaction+OTP&alt=json
 
09:10:00.937
2024-07-25 03:40:00,937—DEBUG-googleapiclient.discovery — URL being requested: GET https://gmail.googleapis.com/gmail/v1/users/me/messages/190e7f906d9a91b0?alt=json
 
09:10:16.278
2024-07-25 03:40:16,278—DEBUG-googleapiclient.discovery — URL being requested: GET https://gmail.googleapis.com/gmail/v1/users/me/messages?maxResults=1&labelIds=INBOX&q=from%3Aedis%40cdslindia.co.in+is%3Aunread+subject%3ATransaction+OTP&alt=json
 
09:10:16.817
2024-07-25 03:40:16,793—DEBUG-googleapiclient.discovery — URL being requested: GET https://gmail.googleapis.com/gmail/v1/users/me/messages/190e7f906d9a91b0?alt=json
 
09:10:32.134
2024-07-25 03:40:32,133—DEBUG-googleapiclient.discovery — URL being requested: GET https://gmail.googleapis.com/gmail/v1/users/me/messages?maxResults=1&labelIds=INBOX&q=from%3Aedis%40cdslindia.co.in+is%3Aunread+subject%3ATransaction+OTP&alt=json
 
09:10:32.618
2024-07-25 03:40:32,618—DEBUG-googleapiclient.discovery — URL being requested: GET https://gmail.googleapis.com/gmail/v1/users/me/messages/190e7f906d9a91b0?alt=json
 
09:10:47.940
2024-07-25 03:40:47,940—DEBUG-googleapiclient.discovery — URL being requested: GET https://gmail.googleapis.com/gmail/v1/users/me/messages?maxResults=1&labelIds=INBOX&q=from%3Aedis%40cdslindia.co.in+is%3Aunread+subject%3ATransaction+OTP&alt=json
 
09:10:48.557
2024-07-25 03:40:48,557—DEBUG-googleapiclient.discovery — URL being requested: GET https://gmail.googleapis.com/gmail/v1/users/me/messages/190e7f906d9a91b0?alt=json
 
09:11:03.944
2024-07-25 03:41:03,943—DEBUG-googleapiclient.discovery — URL being requested: GET https://gmail.googleapis.com/gmail/v1/users/me/messages?maxResults=1&labelIds=INBOX&q=from%3Aedis%40cdslindia.co.in+is%3Aunread+subject%3ATransaction+OTP&alt=json
 
09:11:04.464
2024-07-25 03:41:04,464—DEBUG-googleapiclient.discovery — URL being requested: GET https://gmail.googleapis.com/gmail/v1/users/me/messages/190e7f906d9a91b0?alt=json

Thanks!
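
The request URLs in the log above are percent-encoded, so the search query that was actually sent is easy to misread. A small sketch using only the standard library decodes it; `decode_list_request` is a hypothetical helper name, and the URL is copied from the log lines.

```python
from urllib.parse import parse_qs, urlsplit


def decode_list_request(url: str) -> dict:
    """Return the decoded query parameters of a logged request URL."""
    params = parse_qs(urlsplit(url).query)
    # parse_qs returns a list per key; each parameter appears once in these URLs
    return {key: values[0] for key, values in params.items()}


logged_url = (
    "https://gmail.googleapis.com/gmail/v1/users/me/messages"
    "?maxResults=1&labelIds=INBOX"
    "&q=from%3Aedis%40cdslindia.co.in+is%3Aunread+subject%3ATransaction+OTP"
    "&alt=json"
)
params = decode_list_request(logged_url)
print(params["q"])  # from:edis@cdslindia.co.in is:unread subject:Transaction OTP
```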

@shikharvaish28
Author

Hi @ohmayr, did you get a chance to look at this issue? It is causing some real pain.

@ohmayr
Contributor

ohmayr commented Aug 9, 2024

Hi @shikharvaish28, thanks for reporting this and sharing the reproduction code. I see that you've also shared some logs for the requested URLs.

Can you share the error / response that you're getting for a failed request?

@shikharvaish28
Author

Hi @ohmayr, since I use the Python library I don't have the raw response. Instead, I added some more logging, and here is what I got:


Aug 5, 2024 09:08:13.233
2024-08-05 03:38:13,233—DEBUG-googleapiclient.discovery — URL being requested: GET https://gmail.googleapis.com/gmail/v1/users/me/messages?maxResults=1&labelIds=INBOX&q=from%3Aedis%40cdslindia.co.in+is%3Aunread+subject%3ATransaction+OTP&alt=json
 
Aug 5, 2024 09:08:13.636
2024-08-05 03:38:13,635—ERROR-edis.edis_utils — [get_otp] No messages from CDSL found. Message = []
 
Aug 5, 2024 09:08:13.787
2024-08-05 03:38:13,786—DEBUG-urllib3.connectionpool — https://o154087.ingest.sentry.io:443 "POST /api/XXXXXXXXXXXXXXXX/store/ HTTP/1.1" 200 41
 
Aug 5, 2024 09:08:24.414
2024-08-05 03:38:24,413—DEBUG-google.auth.transport.requests — Making request: POST https://oauth2.googleapis.com/token
 
Aug 5, 2024 09:08:24.414
2024-08-05 03:38:24,414—DEBUG-urllib3.connectionpool — Starting new HTTPS connection (1): oauth2.googleapis.com:443
 
Aug 5, 2024 09:08:24.639
2024-08-05 03:38:24,639—DEBUG-urllib3.connectionpool — https://oauth2.googleapis.com:443 "POST /token HTTP/1.1" 200 None
 
Aug 5, 2024 09:08:24.816
2024-08-05 03:38:24,655—DEBUG-googleapiclient.discovery — URL being requested: GET https://gmail.googleapis.com/gmail/v1/users/me/messages?maxResults=1&labelIds=INBOX&q=from%3Aedis%40cdslindia.co.in+is%3Aunread+subject%3ATransaction+OTP&alt=json
 
Aug 5, 2024 09:08:25.141
2024-08-05 03:38:25,141—ERROR-edis.edis_utils — [get_otp] No messages from CDSL found. Message = []
 
Aug 5, 2024 09:08:25.151
2024-08-05 03:38:25,150—DEBUG-urllib3.connectionpool — Starting new HTTPS connection (1): o154087.ingest.sentry.io:443
 
Aug 5, 2024 09:08:25.371
2024-08-05 03:38:25,371—DEBUG-urllib3.connectionpool — https://o154087.ingest.sentry.io:443 "POST /api/XXXXXXXXXXXXXXXX/store/ HTTP/1.1" 200 41
 
Aug 5, 2024 09:08:28.659
2024-08-05 03:38:28,659—DEBUG-googleapiclient.discovery — URL being requested: GET https://gmail.googleapis.com/gmail/v1/users/me/messages?maxResults=1&labelIds=INBOX&q=from%3Aedis%40cdslindia.co.in+is%3Aunread+subject%3ATransaction+OTP&alt=json
 
Aug 5, 2024 09:08:29.183
2024-08-05 03:38:29,054—ERROR-edis.edis_utils — [get_otp] No messages from CDSL found. Message = []
 
Aug 5, 2024 09:08:29.216
2024-08-05 03:38:29,216—DEBUG-urllib3.connectionpool — https://o154087.ingest.sentry.io:443 "POST /api/XXXXXXXXXXXXXXXX/store/ HTTP/1.1" 200 41
 
Aug 5, 2024 09:08:40.167
2024-08-05 03:38:40,167—DEBUG-googleapiclient.discovery — URL being requested: GET https://gmail.googleapis.com/gmail/v1/users/me/messages?maxResults=1&labelIds=INBOX&q=from%3Aedis%40cdslindia.co.in+is%3Aunread+subject%3ATransaction+OTP&alt=json
 
Aug 5, 2024 09:08:40.813
2024-08-05 03:38:40,782—ERROR-edis.edis_utils — [get_otp] No messages from CDSL found. Message = []

def get_otp_from_email(tries=0) -> str:
    service = get_gmail_service()
    try:
        results = service.users().messages().list(
            userId='me',
            maxResults=1,
            labelIds=['INBOX'],
            q="from:edis@cdslindia.co.in is:unread subject:Transaction OTP"
        ).execute()
        # The 'messages' key is absent when nothing matches, so check before indexing
        messages = results.get('messages', [])

        if not messages:
            logger.warning('[get_otp] No messages from CDSL found.')
        else:
            msg = service.users().messages().get(userId='me', id=messages[0]['id']).execute()
            # The OTP is the first 6-digit run in the message snippet
            match_opt = re.search(r'(\d{6})', str(msg['snippet']))
            if match_opt:
                return match_opt.group()
            logger.warning('[get_otp] Message found, but no 6-digit OTP in its snippet.')
    except Exception as exp:
        logger.warning("[get_otp] Failed to read message from CDSL: %s", exp)

    if tries >= 5:
        logger.error("[get_otp] No email from edis even after %s tries", tries)
        exit()

    # If email is not received, wait for 15 seconds and try again,
    # returning the result of the retry to the caller
    time.sleep(15)
    return get_otp_from_email(tries=tries + 1)

Please let me know if I need to add something to capture the response.

@ohmayr
Contributor

ohmayr commented Aug 9, 2024

@shikharvaish28 I ran the provided script with my own query parameters and I'm able to successfully get the message.

I can't say for sure why it isn't working in your case, since you mentioned that it only happens sometimes. Could you try tinkering with the query parameters and see if that helps?

It would help to have a concrete example that reproduces this.
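
As a rough sketch of what "tinkering with the query parameters" might look like, the snippet below builds a few variants of the search string and counts matches for each on a failing run. The variant names and both helper functions are hypothetical. The grouped-subject form rests on my understanding of Gmail search operators, namely that in `subject:Transaction OTP` only the first word is tied to the subject, so treat that as an assumption to verify.

```python
from typing import Dict

SENDER = "edis@cdslindia.co.in"  # taken from the request URLs in the logs


def build_query_variants(sender: str = SENDER) -> Dict[str, str]:
    """Return search strings from the strictest to the most relaxed."""
    return {
        "original": f"from:{sender} is:unread subject:Transaction OTP",
        # Assumption: quoting ties both words to the subject operator
        "grouped_subject": f'from:{sender} is:unread subject:"Transaction OTP"',
        # Drops is:unread, in case the message was already marked as read
        "any_read_state": f'from:{sender} subject:"Transaction OTP"',
        "sender_only": f"from:{sender}",
    }


def count_matches(service, query: str) -> int:
    """Count messages returned by users.messages.list for one query."""
    response = (
        service.users()
        .messages()
        .list(userId="me", labelIds=["INBOX"], maxResults=10, q=query)
        .execute()
    )
    # The 'messages' key is omitted when nothing matches
    return len(response.get("messages", []))
```

If the relaxed variants find the message while `original` does not, that would narrow the problem down to the query rather than the client library.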

@shikharvaish28
Author

shikharvaish28 commented Aug 9, 2024

@ohmayr that's precisely the problem: the message fetch fails sometimes, but not always (it does happen at least thrice a week, and the code runs only once a day). This happens across two tenants with different accounts. The code runs simultaneously on both instances, but usually only one fails while the other succeeds.

Can you help me add debug logs to capture the exception/response so that I can share the same when the message fetch fails?
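
One possible way to capture this, offered as a sketch rather than a confirmed recipe: wrap the `list` call so that the full response is logged on success and the status and body are logged on failure. `list_messages_with_logging` and the `gmail_debug` logger name are hypothetical. The code relies on `googleapiclient` raising `HttpError` (which carries `resp` and `content`) for non-2xx replies, but reads those attributes with `getattr` so it does not depend on that exception type.

```python
import json
import logging

logger = logging.getLogger("gmail_debug")  # hypothetical logger name


def list_messages_with_logging(service, query: str) -> list:
    """Run users.messages.list and log the raw response or the full error."""
    try:
        response = (
            service.users()
            .messages()
            .list(userId="me", labelIds=["INBOX"], maxResults=1, q=query)
            .execute()
        )
    except Exception as exc:
        # HttpError exposes `resp` (status, headers) and `content` (raw body);
        # getattr keeps this working for other exception types as well.
        status = getattr(getattr(exc, "resp", None), "status", None)
        body = getattr(exc, "content", None)
        logger.exception("messages.list failed: status=%s body=%r", status, body)
        raise
    # A successful reply without a 'messages' key means the search matched nothing
    logger.info("messages.list response for q=%r: %s", query, json.dumps(response))
    return response.get("messages", [])
```

With this in place, an empty result (a successful reply with no `messages` key) can be told apart from an HTTP error in the logs. For wire-level detail, the library's logging documentation also describes setting `httplib2.debuglevel = 4`.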
