Collect Box Collaboration JSON logs

This document explains how to ingest Box Collaboration JSON logs to Google Security Operations using AWS S3, with an AWS Lambda function and an EventBridge schedule. The parser processes Box event logs in JSON format and maps them to the Unified Data Model (UDM). It extracts relevant fields from the raw logs, performs transformations such as renaming and merging, and enriches the data with intermediary information before outputting structured event data.

Before you begin

  • Google SecOps instance
  • Privileged access to Box (Admin + Developer Console)
  • Privileged access to AWS (S3, IAM, Lambda, EventBridge) in the same Region where you plan to store the logs

Configure Box Developer Console (Client Credentials)

  1. Sign in to Box Developer Console.
  2. Create a Custom App with Server Authentication (Client Credentials Grant).
  3. Set Application Access to App + Enterprise Access.
  4. In Application Scopes, enable Manage enterprise properties.
  5. In Admin Console > Apps > Custom Apps Manager, Authorize the app by Client ID.
  6. Copy and save the Client ID and Client Secret in a secure location.
  7. Go to Admin Console > Account & Billing > Account Information.
  8. Copy and save the Enterprise ID in a secure location.
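
Before moving on to AWS, you can optionally confirm that the credentials work. The following sketch requests a token with the Client Credentials Grant, mirroring the call the Lambda function makes later; the three placeholder values are the ones you saved above.

    import json, urllib.parse, urllib.request

    # Request an enterprise access token via the Client Credentials Grant.
    body = urllib.parse.urlencode({
        "grant_type": "client_credentials",
        "client_id": "YOUR_CLIENT_ID",           # saved in step 6
        "client_secret": "YOUR_CLIENT_SECRET",   # saved in step 6
        "box_subject_type": "enterprise",
        "box_subject_id": "YOUR_ENTERPRISE_ID",  # saved in step 8
    }).encode()
    req = urllib.request.Request("https://api.box.com/oauth2/token", data=body, method="POST")
    req.add_header("Content-Type", "application/x-www-form-urlencoded")
    with urllib.request.urlopen(req, timeout=30) as r:
        token = json.loads(r.read().decode())["access_token"]
    print("token OK:", token[:8] + "...")

If the request fails, verify that the app was authorized in the Admin Console (step 5).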

Configure AWS S3 bucket and IAM for Google SecOps

  1. Create an Amazon S3 bucket following this user guide: Creating a bucket
  2. Save the bucket Name and Region for future reference (for example, box-collaboration-logs).
  3. Create a user following this user guide: Creating an IAM user.
  4. Select the created User.
  5. Select the Security credentials tab.
  6. Click Create Access Key in the Access Keys section.
  7. Select Third-party service as the Use case.
  8. Click Next.
  9. Optional: add a description tag.
  10. Click Create access key.
  11. Click Download CSV file to save the Access Key and Secret Access Key for later use.
  12. Click Done.
  13. Select the Permissions tab.
  14. Click Add permissions in the Permissions policies section.
  15. Select Add permissions.
  16. Select Attach policies directly.
  17. Search for and select the AmazonS3FullAccess policy.
  18. Click Next.
  19. Click Add permissions.
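
Optionally, verify that the new access key can reach the bucket before configuring the feed. A minimal sketch using boto3, assuming the bucket name from step 2; the key pair and Region are placeholders for the values you saved.

    import boto3

    # Write and list a test object with the newly created access key.
    s3 = boto3.client(
        "s3",
        aws_access_key_id="YOUR_ACCESS_KEY_ID",
        aws_secret_access_key="YOUR_SECRET_ACCESS_KEY",
        region_name="us-east-1",  # the Region you chose for the bucket
    )
    s3.put_object(Bucket="box-collaboration-logs",
                  Key="box/collaboration/smoke-test.json", Body=b"{}")
    resp = s3.list_objects_v2(Bucket="box-collaboration-logs", Prefix="box/collaboration/")
    print("objects under prefix:", resp.get("KeyCount", 0))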

Configure the IAM policy and role for S3 uploads

  1. In the AWS console, go to IAM > Policies > Create policy > JSON tab.
  2. Enter the following policy:

      {
        "Version": "2012-10-17",
        "Statement": [
          {
            "Sid": "AllowPutBoxObjects",
            "Effect": "Allow",
            "Action": ["s3:PutObject"],
            "Resource": "arn:aws:s3:::box-collaboration-logs/*"
          },
          {
            "Sid": "AllowGetStateObject",
            "Effect": "Allow",
            "Action": ["s3:GetObject"],
            "Resource": "arn:aws:s3:::box-collaboration-logs/box/collaboration/state.json"
          }
        ]
      }
    
    • Replace box-collaboration-logs if you entered a different bucket name.
  3. Click Next > Create policy.

  4. Go to IAM > Roles > Create role > AWS service > Lambda.

  5. Attach the newly created policy.

  6. Name the role WriteBoxToS3Role and click Create role.
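
The console wizard attaches the trust policy for you when you choose AWS service > Lambda. For reference, or if you create the role with the CLI or an API instead, the role needs a trust relationship like the following (a standard Lambda trust policy, not specific to this integration):

    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Principal": { "Service": "lambda.amazonaws.com" },
          "Action": "sts:AssumeRole"
        }
      ]
    }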

Create the Lambda function

  1. In the AWS Console, go to Lambda > Functions > Create function.
  2. Click Author from scratch.
  3. Provide the following configuration details:

    Setting          Value
    Name             box_collaboration_to_s3
    Runtime          Python 3.13
    Architecture     x86_64
    Execution role   WriteBoxToS3Role
  4. After the function is created, open the Code tab, delete the stub, and enter the following code (box_collaboration_to_s3.py):

      #!/usr/bin/env python3
      # Lambda: Pull Box Enterprise Events to S3 (no transform)
      import os, json, time, urllib.parse
      from urllib.request import Request, urlopen
      from urllib.error import HTTPError, URLError
      import boto3

      TOKEN_URL = "https://api.box.com/oauth2/token"
      EVENTS_URL = "https://api.box.com/2.0/events"

      CID = os.environ["BOX_CLIENT_ID"]
      CSECRET = os.environ["BOX_CLIENT_SECRET"]
      ENT_ID = os.environ["BOX_ENTERPRISE_ID"]
      STREAM_TYPE = os.environ.get("STREAM_TYPE", "admin_logs_streaming")
      LIMIT = int(os.environ.get("LIMIT", "500"))
      BUCKET = os.environ["S3_BUCKET"]
      PREFIX = os.environ.get("S3_PREFIX", "box/collaboration/")
      STATE_KEY = os.environ.get("STATE_KEY", "box/collaboration/state.json")

      s3 = boto3.client("s3")

      def get_state():
          """Read the last saved stream position from S3, if any."""
          try:
              obj = s3.get_object(Bucket=BUCKET, Key=STATE_KEY)
              data = json.loads(obj["Body"].read())
              return data.get("stream_position")
          except Exception:
              return None

      def put_state(pos):
          """Persist the stream position checkpoint to S3."""
          body = json.dumps({"stream_position": pos}, separators=(",", ":")).encode("utf-8")
          s3.put_object(Bucket=BUCKET, Key=STATE_KEY, Body=body,
                        ContentType="application/json")

      def get_token():
          """Obtain an enterprise access token via the Client Credentials Grant."""
          body = urllib.parse.urlencode({
              "grant_type": "client_credentials",
              "client_id": CID,
              "client_secret": CSECRET,
              "box_subject_type": "enterprise",
              "box_subject_id": ENT_ID,
          }).encode()
          req = Request(TOKEN_URL, data=body, method="POST")
          req.add_header("Content-Type", "application/x-www-form-urlencoded")
          with urlopen(req, timeout=30) as r:
              tok = json.loads(r.read().decode())
          return tok["access_token"]

      def fetch_events(token, stream_position=None, timeout=60, max_retries=5):
          """Fetch one page of enterprise events, retrying on 429/5xx/network errors."""
          params = {"stream_type": STREAM_TYPE, "limit": LIMIT,
                    "stream_position": stream_position or "now"}
          qs = urllib.parse.urlencode(params)
          attempt, backoff = 0, 1.0
          while True:
              try:
                  req = Request(f"{EVENTS_URL}?{qs}", method="GET")
                  req.add_header("Authorization", f"Bearer {token}")
                  with urlopen(req, timeout=timeout) as r:
                      return json.loads(r.read().decode())
              except HTTPError as e:
                  if e.code == 429 and attempt < max_retries:
                      # Honor Retry-After when present, else exponential backoff.
                      ra = e.headers.get("Retry-After")
                      delay = int(ra) if (ra and ra.isdigit()) else int(backoff)
                      time.sleep(max(1, delay))
                      attempt += 1
                      backoff *= 2
                      continue
                  if 500 <= e.code <= 599 and attempt < max_retries:
                      time.sleep(backoff)
                      attempt += 1
                      backoff *= 2
                      continue
                  raise
              except URLError:
                  if attempt < max_retries:
                      time.sleep(backoff)
                      attempt += 1
                      backoff *= 2
                      continue
                  raise

      def write_chunk(data):
          """Write one raw events page to S3 under a timestamped key."""
          ts = time.strftime("%Y/%m/%d/%H%M%S", time.gmtime())
          key = f"{PREFIX.rstrip('/')}/{ts}-box-events.json"
          s3.put_object(Bucket=BUCKET, Key=key,
                        Body=json.dumps(data, separators=(",", ":")).encode("utf-8"),
                        ContentType="application/json")
          return key

      def lambda_handler(event=None, context=None):
          token = get_token()
          pos = get_state()
          total, idx = 0, 0
          while True:
              page = fetch_events(token, pos)
              entries = page.get("entries") or []
              if not entries:
                  next_pos = page.get("next_stream_position") or pos
                  if next_pos and next_pos != pos:
                      put_state(next_pos)
                  break
              # Unique key per page
              ts = time.strftime("%Y/%m/%d/%H%M%S", time.gmtime())
              key = f"{PREFIX.rstrip('/')}/{ts}-box-events-{idx:03d}.json"
              s3.put_object(Bucket=BUCKET, Key=key,
                            Body=json.dumps(page, separators=(",", ":")).encode("utf-8"),
                            ContentType="application/json")
              idx += 1
              total += len(entries)
              pos = page.get("next_stream_position") or pos
              if pos:
                  put_state(pos)
              if len(entries) < LIMIT:
                  break
          return {"ok": True, "written": total, "next_stream_position": pos}
    
  5. Go to Configuration > Environment variables > Edit > Add new environment variable.

  6. Enter the following environment variables, replacing the example values with your own:

    Key                 Example
    S3_BUCKET           box-collaboration-logs
    S3_PREFIX           box/collaboration/
    STATE_KEY           box/collaboration/state.json
    BOX_CLIENT_ID       Enter Box Client ID
    BOX_CLIENT_SECRET   Enter Box Client Secret
    BOX_ENTERPRISE_ID   Enter Box Enterprise ID
    STREAM_TYPE         admin_logs_streaming
    LIMIT               500
  7. After the function is created, stay on its page (or open Lambda > Functions > your-function).

  8. Select the Configuration tab.

  9. In the General configuration panel, click Edit.

  10. Change Timeout to 10 minutes (600 seconds) and click Save.
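
Optionally, invoke the function once to confirm it can reach Box and write to S3 before scheduling it. A minimal sketch using boto3 from a workstation with AWS credentials configured; the Region is an assumption, replace it with yours.

    import json
    import boto3

    # Invoke the collector once and print its JSON result.
    lam = boto3.client("lambda", region_name="us-east-1")
    resp = lam.invoke(FunctionName="box_collaboration_to_s3", Payload=b"{}")
    print(json.loads(resp["Payload"].read()))
    # Expected shape: {"ok": true, "written": <count>, "next_stream_position": "..."}

After a successful run, objects should appear under the box/collaboration/ prefix in the bucket, along with the state.json checkpoint.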

Schedule the Lambda function (EventBridge Scheduler)

  1. Go to Amazon EventBridge > Scheduler > Create schedule.
  2. Provide the following configuration details:
    • Recurring schedule: Rate (15 min).
    • Target: your Lambda function.
    • Name: box-collaboration-schedule-15min.
  3. Click Create schedule.
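
If you prefer to create the schedule programmatically, a sketch using the EventBridge Scheduler API is shown below. The account ID and execution role ARN are placeholders; the role must be assumable by EventBridge Scheduler and allowed to call lambda:InvokeFunction on the target function.

    import boto3

    # Create a recurring 15-minute schedule that invokes the Lambda function.
    scheduler = boto3.client("scheduler", region_name="us-east-1")
    scheduler.create_schedule(
        Name="box-collaboration-schedule-15min",
        ScheduleExpression="rate(15 minutes)",
        FlexibleTimeWindow={"Mode": "OFF"},
        Target={
            "Arn": "arn:aws:lambda:us-east-1:123456789012:function:box_collaboration_to_s3",
            "RoleArn": "arn:aws:iam::123456789012:role/SchedulerInvokeLambdaRole",  # hypothetical role
        },
    )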

Configure a feed in Google SecOps to ingest Box logs

  1. Go to SIEM Settings > Feeds.
  2. Click Add New Feed.
  3. In the Feed name field, enter a name for the feed (for example, Box Collaboration).
  4. Select Amazon S3 V2 as the Source type.
  5. Select Box as the Log type.
  6. Click Next.
  7. Specify values for the following input parameters:
    • S3 URI: The bucket URI (the format should be: s3://box-collaboration-logs/box/collaboration/). Replace box-collaboration-logs with the actual name of your bucket.
    • Source deletion options: Select the deletion option according to your preference.
    • Maximum File Age: Include files modified within the last number of days. The default is 180 days.
    • Access Key ID: User access key with access to the S3 bucket.
    • Secret Access Key: User secret key with access to the S3 bucket.
    • Asset namespace: The asset namespace.
    • Ingestion labels: The label to be applied to the events from this feed.
  8. Click Next.
  9. Review your new feed configuration in the Finalize screen, and then click Submit.
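
For reference, each object the Lambda function writes is one raw Box events page. A truncated, illustrative example (field values are hypothetical; field names follow the mapping table below):

    {
      "chunk_size": 1,
      "next_stream_position": "1152922976252290000",
      "entries": [
        {
          "type": "event",
          "event_id": "b9f2d8c0-example",
          "event_type": "COLLABORATION_INVITE",
          "created_at": "2025-01-01T12:00:00-08:00",
          "created_by": {"id": "11111111", "name": "Example Admin", "login": "admin@example.com"},
          "ip_address": "203.0.113.10",
          "source": {"item_id": "22222222", "item_name": "Quarterly Report.docx", "item_type": "file"}
        }
      ]
    }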

UDM Mapping Table

Log field                           UDM mapping                         Logic
additional_details.ekm_id           additional.fields                   Value taken from additional_details.ekm_id
additional_details.service_id       additional.fields                   Value taken from additional_details.service_id
additional_details.service_name     additional.fields                   Value taken from additional_details.service_name
additional_details.shared_link_id   additional.fields                   Value taken from additional_details.shared_link_id
additional_details.size             target.file.size                    Value taken from additional_details.size
additional_details.version_id       additional.fields                   Value taken from additional_details.version_id
created_at                          metadata.event_timestamp            Value taken from created_at
created_by.id                       principal.user.userid               Value taken from created_by.id
created_by.login                    principal.user.email_addresses      Value taken from created_by.login
created_by.name                     principal.user.user_display_name    Value taken from created_by.name
event_id                            metadata.product_log_id             Value taken from event_id
event_type                          metadata.product_event_type         Value taken from event_type
ip_address                          principal.ip                        Value taken from ip_address
source.item_id                      target.file.product_object_id       Value taken from source.item_id
source.item_name                    target.file.full_path               Value taken from source.item_name
source.item_type                    Not mapped
source.login                        target.user.email_addresses         Value taken from source.login
source.name                         target.user.user_display_name       Value taken from source.name
source.owned_by.id                  target.user.userid                  Value taken from source.owned_by.id
source.owned_by.login               target.user.email_addresses         Value taken from source.owned_by.login
source.owned_by.name                target.user.user_display_name       Value taken from source.owned_by.name
source.parent.id                    Not mapped
source.parent.name                  Not mapped
source.parent.type                  Not mapped
source.type                         Not mapped
type                                metadata.log_type                   Value taken from type
                                    metadata.vendor_name                Hardcoded value
                                    metadata.product_name               Hardcoded value
                                    security_result.action              Derived from event_type: if event_type is FAILED_LOGIN then BLOCK, if event_type is USER_LOGIN then ALLOW, otherwise UNSPECIFIED.
                                    extensions.auth.type                Derived from event_type: if event_type is USER_LOGIN or ADMIN_LOGIN then MACHINE, otherwise UNSPECIFIED.
                                    extensions.auth.mechanism           Derived from event_type: if event_type is USER_LOGIN or ADMIN_LOGIN then USERNAME_PASSWORD, otherwise UNSPECIFIED.
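
The derivation rules in the last three rows can be read as simple conditionals. A Python paraphrase for illustration only (the parser itself is not Python):

    def derive_security_action(event_type: str) -> str:
        # security_result.action per the table above.
        if event_type == "FAILED_LOGIN":
            return "BLOCK"
        if event_type == "USER_LOGIN":
            return "ALLOW"
        return "UNSPECIFIED"

    def derive_auth_fields(event_type: str) -> tuple[str, str]:
        # extensions.auth.type and extensions.auth.mechanism per the table above.
        if event_type in ("USER_LOGIN", "ADMIN_LOGIN"):
            return "MACHINE", "USERNAME_PASSWORD"
        return "UNSPECIFIED", "UNSPECIFIED"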

Need more help? Get answers from Community members and Google SecOps professionals.
