Skip to content

Querying Play By Play Data

Matthew Pope edited this page Mar 21, 2021 · 5 revisions

Play By Play

Play by play data is queryable, however it is a but cumbersome at the moment. A single row in the play_by_play table contains up to three player's participation. There is a event_message_type enum foreign key to a table that acts as an enum for high level descriptions. Here are the possible play classifications:

id string
1 FIELD_GOAL_MADE
2 FIELD_GOAL_MISSED
3 FREE_THROW
4 REBOUND
5 TURNOVER
6 FOUL
7 VIOLATION
8 SUBSTITUTION
9 TIMEOUT
10 JUMP_BALL
11 EJECTION
12 PERIOD_BEGIN
13 PERIOD_END
18 UNKNOWN

This can be used to query for specific plays. The following example is for querying for player_ids that were involved in rebound plays:

SELECT player_id1, player_id2, player_id3 FROM play_by_play
LEFT JOIN event_message_type ON play_by_play.event_msg_type = event_message_type.id
WHERE event_message_type.string = 'REBOUND'

The play_by_play table has three description fields, home_description, neutral_description, visitor_description. These can be queried using regex to get more specific information about matched plays. The nba_api documentation has a great write-up on how to do that.

Clone this wiki locally