Create a format for a2 hosting log #342

Open
Chardonneaur opened this issue Jan 17, 2023 · 5 comments

Comments

@Chardonneaur

The A2 Hosting log format is quite similar to OVH's. Here is an example A2 Hosting log line:
54.0.0.0 - - [01/Dec/2022:08:31:01 -0500] "GET / HTTP/1.1" 200 304 "-" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.97 Safari/537.36"

@sgiehl
Member

sgiehl commented Jan 17, 2023

Isn't A2 using some default log format? Did you try letting the log parser detect the format?

@Chardonneaur
Author

I did, but the output said that the URLs are undefined.

@sgiehl
Member

sgiehl commented Jan 17, 2023

In that case, I guess it should work if you provide a custom regex.
Unfortunately I don't think we will be able to work on any improvements for the log importer any time soon.

@Chardonneaur
Author

Could you guide me in writing this custom regex?

@sgiehl
Member

sgiehl commented Jan 18, 2023

Those are some of the current formats, hope that helps to create a custom one:

_COMMON_LOG_FORMAT = (
    r'(?P<ip>[\w*.:-]+)\s+\S+\s+(?P<userid>\S+)\s+\[(?P<date>.*?)\s+(?P<timezone>.*?)\]\s+'
    r'"(?P<method>\S+)\s+(?P<path>.*?)\s+\S+"\s+(?P<status>\d+)\s+(?P<length>\S+)'
)
_NCSA_EXTENDED_LOG_FORMAT = (_COMMON_LOG_FORMAT +
    r'\s+"(?P<referrer>.*?)"\s+"(?P<user_agent>.*?)"'
)
_S3_LOG_FORMAT = (
    r'\S+\s+(?P<host>\S+)\s+\[(?P<date>.*?)\s+(?P<timezone>.*?)\]\s+(?P<ip>[\w*.:-]+)\s+'
    r'(?P<userid>\S+)\s+\S+\s+\S+\s+\S+\s+"(?P<method>\S+)\s+(?P<path>.*?)\s+\S+"\s+(?P<status>\d+)\s+\S+\s+(?P<length>\S+)\s+'
    r'\S+\s+\S+\s+\S+\s+"(?P<referrer>.*?)"\s+"(?P<user_agent>.*?)"'
)
_ICECAST2_LOG_FORMAT = (_NCSA_EXTENDED_LOG_FORMAT +
    r'\s+(?P<session_time>[0-9-]+)'
)
_ELB_LOG_FORMAT = (
    r'(?:\S+\s+)?(?P<date>[0-9-]+T[0-9:]+)\.\S+\s+\S+\s+(?P<ip>[\w*.:-]+):\d+\s+\S+:\d+\s+\S+\s+(?P<generation_time_secs>\S+)\s+\S+\s+'
    r'(?P<status>\d+)\s+\S+\s+\S+\s+(?P<length>\S+)\s+'
    r'"\S+\s+\w+:\/\/(?P<host>[\w\-\.]*):\d+(?P<path>\/\S*)\s+[^"]+"\s+"(?P<user_agent>[^"]+)"\s+\S+\s+\S+'
)
_OVH_FORMAT = (
    r'(?P<ip>\S+)\s+' + _HOST_PREFIX + r'(?P<userid>\S+)\s+\[(?P<date>.*?)\s+(?P<timezone>.*?)\]\s+'
    r'"\S+\s+(?P<path>.*?)\s+\S+"\s+(?P<status>\S+)\s+(?P<length>\S+)'
    r'\s+"(?P<referrer>.*?)"\s+"(?P<user_agent>.*?)"'
)
_HAPROXY_FORMAT = (
    r'.*:\ (?P<ip>[\w*.]+).*\[(?P<date>.*)\].*\ (?P<status>\b\d{3}\b)\ (?P<length>\d+)\ -.*\"(?P<method>\S+)\ (?P<path>\S+).*'
)
_GANDI_SIMPLE_HOSTING_FORMAT = (
    r'(?P<host>[0-9a-zA-Z-_.]+)\s+(?P<ip>[a-zA-Z0-9.]+)\s+\S+\s+(?P<userid>\S+)\s+\[(?P<date>.+?)\s+(?P<timezone>.+?)\]\s+\((?P<generation_time_secs>[0-9a-zA-Z\s]*)\)\s+"(?P<method>[A-Z]+)\s+(?P<path>\S+)\s+(\S+)"\s+(?P<status>[0-9]+)\s+(?P<length>\S+)\s+"(?P<referrer>\S+)"\s+"(?P<user_agent>[^"]+)"'
)
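
To check a candidate regex before passing it to the importer, it can help to run it against a real log line with Python's `re` module and look at the named groups. Below is a minimal sketch, not part of the importer itself: the two patterns are copied from the listing above (without the leading underscore), the sample line is the one posted at the top of this issue, and `parse_line` is a hypothetical helper introduced only for illustration.

```python
import re

# Patterns copied from the listing above.
COMMON_LOG_FORMAT = (
    r'(?P<ip>[\w*.:-]+)\s+\S+\s+(?P<userid>\S+)\s+\[(?P<date>.*?)\s+(?P<timezone>.*?)\]\s+'
    r'"(?P<method>\S+)\s+(?P<path>.*?)\s+\S+"\s+(?P<status>\d+)\s+(?P<length>\S+)'
)
NCSA_EXTENDED_LOG_FORMAT = (
    COMMON_LOG_FORMAT +
    r'\s+"(?P<referrer>.*?)"\s+"(?P<user_agent>.*?)"'
)

# Sample A2 Hosting line from the first comment in this issue.
SAMPLE_LINE = (
    '54.0.0.0 - - [01/Dec/2022:08:31:01 -0500] "GET / HTTP/1.1" 200 304 "-" '
    '"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) '
    'Chrome/83.0.4103.97 Safari/537.36"'
)


def parse_line(line, pattern=NCSA_EXTENDED_LOG_FORMAT):
    """Hypothetical helper: return the named groups as a dict, or None if no match."""
    match = re.match(pattern, line)
    return match.groupdict() if match else None


if __name__ == "__main__":
    fields = parse_line(SAMPLE_LINE)
    if fields is None:
        print("no match; the pattern needs adjusting")
    else:
        # Print every captured field so a missing or empty `path` is easy to spot.
        for name, value in fields.items():
            print(f"{name}: {value}")
```

On the single sample line above, the NCSA extended pattern matches and `path` is captured as `/`. That does not by itself explain the "undefined" URLs reported earlier, so it is worth running the same check on a few more lines from the real log file and adjusting whichever group fails to capture.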
