GITHUB
This document explains how to ingest GitHub audit logs to Google Security Operations using Amazon S3. The parser attempts to extract data from the "message" field using various grok patterns, handling both JSON and non-JSON formats. Based on the extracted "process_type", it applies specific parsing logic using grok, kv, and other filters to map the raw log data into the Unified Data Model (UDM) schema.
Before you begin
Make sure you have the following prerequisites:
- A Google SecOps instance.
- Privileged access to a GitHub Enterprise Cloud tenant with enterprise owner permissions.
- Privileged access to AWS (S3 and IAM).
Collect GitHub Enterprise Cloud prerequisites (Enterprise access)
- Sign in to the GitHub Enterprise Cloud Admin Console.
- Go to Enterprise settings > Settings > Audit log > Log streaming.
- Make sure you have enterprise owner permissions to configure audit log streaming.
- Copy and save the following details in a secure location:
  - GitHub Enterprise name
  - Organization names under the enterprise
Configure AWS S3 bucket and Identity and Access Management for Google SecOps
- Create an Amazon S3 bucket following this user guide: Creating a bucket.
- Save the bucket Name and Region for future reference (for example, github-audit-logs). If you prefer to create the bucket with code, see the sketch after this list.
- Create a User following this user guide: Creating an IAM user.
- Select the created User.
- Select the Security credentials tab.
- Click Create Access Key in the Access Keys section.
- Select Third-party service as the Use case.
- Click Next.
- Optional: Add a description tag.
- Click Create access key.
- Click Download .CSV file to save the Access Key and Secret Access Key for future reference.
- Click Done.
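If you prefer to script bucket creation instead of using the console, the following minimal sketch shows equivalent boto3 calls. It assumes boto3 is installed, your AWS credentials are already configured, and that the example bucket name github-audit-logs and region us-east-1 match your environment.

```python
import boto3

REGION = "us-east-1"          # example region; use your own
BUCKET = "github-audit-logs"  # example bucket name; use your own

s3 = boto3.client("s3", region_name=REGION)

# us-east-1 must omit CreateBucketConfiguration; all other regions require it.
if REGION == "us-east-1":
    s3.create_bucket(Bucket=BUCKET)
else:
    s3.create_bucket(
        Bucket=BUCKET,
        CreateBucketConfiguration={"LocationConstraint": REGION},
    )
print(f"Created s3://{BUCKET} in {REGION}")
```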
Configure the IAM policy for GitHub S3 streaming
- In the AWS console, go to IAM > Policies > Create policy > JSON tab.
- Copy and paste the following policy.
- Policy JSON (replace github-audit-logs if you entered a different bucket name):

      {
        "Version": "2012-10-17",
        "Statement": [
          {
            "Sid": "AllowPutObjects",
            "Effect": "Allow",
            "Action": "s3:PutObject",
            "Resource": "arn:aws:s3:::github-audit-logs/*"
          }
        ]
      }

- Click Next > Create policy.
- Name the policy GitHubAuditStreamingPolicy and click Create policy.
- Go back to the IAM user created earlier.
- Select the Permissions tab.
- Click Add permissions > Attach policies directly.
- Search for and select GitHubAuditStreamingPolicy.
- Click Next > Add permissions. (To create and attach the policy programmatically, see the sketch after this list.)
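As a rough illustration of what these console steps do, the sketch below creates the same policy and attaches it to the IAM user with boto3. The user name github-streamer is a placeholder for the IAM user you created earlier; adjust it and the bucket name to your environment.

```python
import json

import boto3

iam = boto3.client("iam")

# Same policy document as shown above: allow GitHub to write objects only.
policy_document = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllowPutObjects",
            "Effect": "Allow",
            "Action": "s3:PutObject",
            "Resource": "arn:aws:s3:::github-audit-logs/*",
        }
    ],
}

# Create the streaming policy and attach it to the user GitHub authenticates as.
policy = iam.create_policy(
    PolicyName="GitHubAuditStreamingPolicy",
    PolicyDocument=json.dumps(policy_document),
)
iam.attach_user_policy(
    UserName="github-streamer",  # placeholder: the IAM user created earlier
    PolicyArn=policy["Policy"]["Arn"],
)
```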
Configure GitHub Enterprise Cloud audit log streaming
- Sign in to GitHub Enterprise Cloud as an enterprise owner.
- Click your profile photo, then click Enterprise settings.
- In the enterprise account sidebar, click Settings > Audit log > Log streaming.
- Select Configure stream and click Amazon S3.
- Under Authentication, click Access keys.
- Provide the following configuration details:
  - Region: Select the bucket's region (for example, us-east-1).
  - Bucket: Type the name of the bucket you want to stream to (for example, github-audit-logs).
  - Access Key ID: Enter your access key ID from the IAM user.
  - Secret Key: Enter your secret key from the IAM user.
- Click Check endpoint to verify that GitHub can connect and write to the Amazon S3 endpoint.
- After you've successfully verified the endpoint, click Save. (To confirm that audit log objects are arriving in the bucket, see the sketch after this list.)
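Once the stream is saved, GitHub begins writing audit log objects to the bucket. A minimal sketch for spot-checking delivery, assuming boto3 and credentials that allow s3:ListBucket on the example bucket:

```python
import boto3

BUCKET = "github-audit-logs"  # example bucket name; use your own

s3 = boto3.client("s3")
resp = s3.list_objects_v2(Bucket=BUCKET, MaxKeys=10)

# Print a handful of keys so you can confirm GitHub is writing to the bucket.
for obj in resp.get("Contents", []):
    print(obj["Key"], obj["LastModified"], obj["Size"])
```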
Create read-only IAM user & keys for Google SecOps
- Go to AWS Console > IAM > Users > Add users.
- Click Add users.
- Provide the following configuration details:
  - User: Enter secops-reader.
  - Access type: Select Access key – Programmatic access.
- Click Create user.
- Attach the minimal read policy (custom): Users > secops-reader > Permissions > Add permissions > Attach policies directly > Create policy.
- JSON:

      {
        "Version": "2012-10-17",
        "Statement": [
          {
            "Effect": "Allow",
            "Action": ["s3:GetObject"],
            "Resource": "arn:aws:s3:::github-audit-logs/*"
          },
          {
            "Effect": "Allow",
            "Action": ["s3:ListBucket"],
            "Resource": "arn:aws:s3:::github-audit-logs"
          }
        ]
      }

- Name = secops-reader-policy.
- Click Create policy > search/select > Next > Add permissions.
- Create an access key for secops-reader: Security credentials > Access keys > Create access key > download the .CSV (you'll paste these values into the feed). To sanity-check the new keys before configuring the feed, see the sketch after this list.
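Before pasting the keys into the feed, you can optionally confirm that the secops-reader credentials can list and read objects (and nothing more). A minimal sketch, assuming boto3 and the example bucket name used throughout this guide; the key values shown are placeholders for the values in the downloaded CSV:

```python
import boto3

BUCKET = "github-audit-logs"  # example bucket name; use your own

# Placeholders: load the real values from the downloaded CSV or the environment.
session = boto3.Session(
    aws_access_key_id="AKIA...EXAMPLE",
    aws_secret_access_key="REPLACE_WITH_SECRET_FROM_CSV",
)
s3 = session.client("s3")

# ListBucket and GetObject should succeed; any write attempt should be denied.
resp = s3.list_objects_v2(Bucket=BUCKET, MaxKeys=1)
for obj in resp.get("Contents", []):
    body = s3.get_object(Bucket=BUCKET, Key=obj["Key"])["Body"].read()
    print(f"Read {obj['Key']} ({len(body)} bytes) with secops-reader")
```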
Configure a feed in Google SecOps to ingest GitHub logs
- Go to SIEM Settings > Feeds.
- Click + Add New Feed.
- In the Feed name field, enter a name for the feed (for example, GitHub audit logs).
- Select Amazon S3 V2 as the Source type.
- Select GitHub as the Log type.
- Click Next.
- Specify values for the following input parameters:
  - S3 URI: s3://github-audit-logs/
  - Source deletion options: Select the deletion option according to your preference.
  - Maximum File Age: Include files modified in the last number of days. The default is 180 days.
  - Access Key ID: The user access key with access to the S3 bucket.
  - Secret Access Key: The user secret key with access to the S3 bucket.
  - Asset namespace: The asset namespace.
  - Ingestion labels: The label applied to the events from this feed.
- Click Next.
- Review your new feed configuration in the Finalize screen, and then click Submit.
UDM mapping table
| Log Field | UDM Mapping | Logic |
| --- | --- | --- |
| actor | principal.user.userid | The value is taken from the actor field. |
| actor_id | principal.user.attribute.labels.value | The value is taken from the actor_id field. |
| actor_ip | principal.ip | The value is taken from the actor_ip field. |
| actor_location.country_code | principal.location.country_or_region | The value is taken from the actor_location.country_code field. |
| application_name | target.application | The value is taken from the application_name field. |
| business | target.user.company_name | The value is taken from the business field. |
| business_id | target.resource.attribute.labels.value | The value is taken from the business_id field. |
| config.url | target.url | The value is taken from the config.url field. |
| created_at | metadata.event_timestamp | The value is converted from UNIX milliseconds to a timestamp. |
| data.cancelled_at | extensions.vulns.vulnerabilities.scan_end_time | The value is converted from ISO8601 format to a timestamp. |
| data.email | target.email | The value is taken from the data.email field. |
| data.event | security_result.about.labels.value | The value is taken from the data.event field. |
| data.events | security_result.about.labels.value | The value is taken from the data.events field. |
| data.head_branch | security_result.about.labels.value | The value is taken from the data.head_branch field. |
| data.head_sha | target.file.sha256 | The value is taken from the data.head_sha field. |
| data.hook_id | target.resource.attribute.labels.value | The value is taken from the data.hook_id field. |
| data.started_at | extensions.vulns.vulnerabilities.scan_start_time | The value is converted from ISO8601 format to a timestamp. |
| data.team | target.user.group_identifiers | The value is taken from the data.team field. |
| data.trigger_id | security_result.about.labels.value | The value is taken from the data.trigger_id field. |
| data.workflow_id | security_result.about.labels.value | The value is taken from the data.workflow_id field. |
| data.workflow_run_id | security_result.about.labels.value | The value is taken from the data.workflow_run_id field. |
| enterprise.name | additional.fields.value.string_value | The value is taken from the enterprise.name field. |
| external_identity_nameid | target.user.email_addresses | If the value is an email address, it is added to the target.user.email_addresses array. |
| external_identity_nameid | target.user.userid | The value is taken from the external_identity_nameid field. |
| external_identity_username | target.user.user_display_name | The value is taken from the external_identity_username field. |
| hashed_token | network.session_id | The value is taken from the hashed_token field. |
| job_name | target.resource.attribute.labels.value | The value is taken from the job_name field. |
| job_workflow_ref | target.resource.attribute.labels.value | The value is taken from the job_workflow_ref field. |
| org | target.administrative_domain | The value is taken from the org field. |
| org_id | additional.fields.value.string_value | The value is taken from the org_id field. |
| programmatic_access_type | additional.fields.value.string_value | The value is taken from the programmatic_access_type field. |
| public_repo | additional.fields.value.string_value | The value is taken from the public_repo field. |
| public_repo | target.location.name | If the value is "false", it is mapped to "PRIVATE". Otherwise, it is mapped to "PUBLIC". |
| query_string | additional.fields.value.string_value | The value is taken from the query_string field. |
| rate_limit_remaining | additional.fields.value.string_value | The value is taken from the rate_limit_remaining field. |
| repo | target.resource.name | The value is taken from the repo field. |
| repo_id | additional.fields.value.string_value | The value is taken from the repo_id field. |
| repository_public | additional.fields.value.string_value | The value is taken from the repository_public field. |
| request_body | additional.fields.value.string_value | The value is taken from the request_body field. |
| request_method | network.http.method | The value is converted to uppercase. |
| route | additional.fields.value.string_value | The value is taken from the route field. |
| status_code | network.http.response_code | The value is converted to an integer. |
| timestamp | metadata.event_timestamp | The value is converted from UNIX milliseconds to a timestamp. |
| token_id | additional.fields.value.string_value | The value is taken from the token_id field. |
| token_scopes | additional.fields.value.string_value | The value is taken from the token_scopes field. |
| transport_protocol_name | network.application_protocol | The value is converted to uppercase. |
| url_path | target.url | The value is taken from the url_path field. |
| user | target.user.user_display_name | The value is taken from the user field. |
| user_agent | network.http.user_agent | The value is taken from the user_agent field. |
| user_agent | network.http.parsed_user_agent | The value is parsed. |
| user_id | target.user.userid | The value is taken from the user_id field. |
| workflow.name | security_result.about.labels.value | The value is taken from the workflow.name field. |
| workflow_run.actor.login | principal.user.userid | The value is taken from the workflow_run.actor.login field. |
| workflow_run.event | additional.fields.value.string_value | The value is taken from the workflow_run.event field. |
| workflow_run.head_branch | security_result.about.labels.value | The value is taken from the workflow_run.head_branch field. |
| workflow_run.head_sha | target.file.sha256 | The value is taken from the workflow_run.head_sha field. |
| workflow_run.id | target.resource.attribute.labels.value | The value is taken from the workflow_run.id field. |
| workflow_run.workflow_id | security_result.about.labels.value | The value is taken from the workflow_run.workflow_id field. |
| N/A | metadata.event_type | The value is determined based on the action and actor fields. If the action field contains "_member", the value is set to "USER_RESOURCE_UPDATE_PERMISSIONS". If the action field is not empty and the actor field is not empty, the value is set to "USER_RESOURCE_UPDATE_CONTENT". Otherwise, the value is set to "USER_RESOURCE_ACCESS". |
| N/A | metadata.log_type | The value is set to "GITHUB". |
| N/A | metadata.product_name | The value is set to "GITHUB". |
| N/A | metadata.vendor_name | The value is set to "GITHUB". |
| N/A | target.resource.resource_type | The value is set to "STORAGE_OBJECT". |
| N/A | security_result.about.labels.key | The value is set to a constant string based on the corresponding data field. For example, for data.workflow_id, the key is set to "Workflow Id". |
| N/A | target.resource.attribute.labels.key | The value is set to a constant string based on the corresponding data field. For example, for data.hook_id, the key is set to "Hook Id". |