Prune packet capture files without losing statistical information

One of the hard sells in Network Security Monitoring is convincing users that storing packet trails is a feasible proposition. Storing every byte that ever crossed your wire is too expensive, so we need a technique that prunes the capture and keeps the most useful packets: those likely to be of help in a future investigation.

Huge old capture files can be optimized

Some techniques:

  1. Do not store traffic between known trusted endpoints, such as backup servers
  2. Only store the first MB of every flow

The second technique is surprisingly effective because it is an open secret in network traffic monitoring that the large majority of traffic by volume is carried by a tiny number of flows. A number of tools, such as TrimPCAP from NETRESEC, apply this technique.
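The first technique in the list above can be approximated on an existing capture with a plain BPF filter. The helper below is only a sketch: `build_exclude_filter` is a hypothetical name, the addresses are placeholders from the documentation range, and it excludes all traffic to or from the listed hosts rather than only traffic between them.

```sh
# Build a BPF expression that excludes every packet to or from the given hosts.
# Function name and addresses are illustrative, not part of Trisul.
build_exclude_filter() {
    filter=""
    for h in "$@"; do
        if [ -z "$filter" ]; then
            filter="host $h"
        else
            filter="$filter or host $h"
        fi
    done
    # With no hosts there is nothing to exclude, so print nothing.
    if [ -n "$filter" ]; then
        printf 'not (%s)\n' "$filter"
    fi
}

build_exclude_filter 192.0.2.10 192.0.2.11
```

The resulting expression could then be passed to tcpdump when rewriting a capture, for example `tcpdump -r in.pcap -w out.pcap "$(build_exclude_filter 192.0.2.10 192.0.2.11)"`, where both file names are placeholders.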

Trim at capture time or later

In Trisul, you can specify a policy that filters at capture time itself (Store only 1MB per flow). Yet some of our customers want full content for at least a few days and then don't mind losing a bit of resolution beyond that. Our latest release includes a free tool called trisul_flowcap that lets you prune already captured PCAP dumps.

A new technique : Sampling after threshold

A naive approach to per-flow packet pruning would simply drop every packet that belongs to a flow that has already transferred more than the threshold number of bytes. This works, but the resulting PCAP loses statistical and flow information:

  1. the bandwidth volumes in the pruned PCAP will be much lower due to the loss of these elephant flows
  2. the flow durations in the pruned PCAP will be very short, not reflective of the actual duration
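The effect of the naive approach on volume can be illustrated with a small sketch. It assumes a made-up text input with one packet per line as `<flow-key> <bytes>`; this format and the name `naive_cap` are for illustration only and are not something trisul_flowcap reads or provides.

```sh
# Naive per-flow cap: drop every packet of a flow once the flow has already
# transferred more than CAP bytes.
# stdin : one packet per line, "<flow-key> <bytes>" (assumed format)
# stdout: "<packets in> <bytes in> <packets kept> <bytes kept>"
naive_cap() {
    awk -v cap="$1" '
        {
            pkts_in++; bytes_in += $2
            # Keep the packet only while its flow is still within the cap.
            if (seen[$1] + 0 <= cap + 0) { pkts_out++; bytes_out += $2 }
            seen[$1] += $2
        }
        END { printf "%d %d %d %d\n", pkts_in, bytes_in, pkts_out, bytes_out }
    '
}

# Flow A sends four 100-byte packets, flow B sends one 50-byte packet.
printf 'A 100\nA 100\nA 100\nA 100\nB 50\n' | naive_cap 150
```

With a cap of 150 bytes the last two packets of flow A are dropped, so the byte count seen by any later analysis falls from 450 to 250, and flow A appears to end after its second packet.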

Metrics are very important to Trisul, so we came up with the idea of applying a sampling rate once a flow crosses the threshold limit, then adjusting the WireLength in the pruned PCAP to account for the missing bytes. The sampling rate chosen should also account for TCP timeouts between packets. We found very little extra overhead due to sampling.
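A minimal sketch of sampling after the threshold, using the same assumed `<flow-key> <bytes>` text input rather than real PCAP records. The function name `sample_after_cap` is hypothetical, and the choice to fold skipped bytes into the next sampled packet is an assumption; the text does not say which packet of each group trisul_flowcap keeps.

```sh
# Sampling after threshold: below CAP every packet is kept unchanged; above it
# only 1 packet in RATE is kept, and the bytes of the skipped packets are
# added to the wire length of that sampled packet.
# stdin : "<flow-key> <bytes>" per packet (assumed format)
# stdout: "<flow-key> <captured bytes> <wire length>" per kept packet
sample_after_cap() {
    awk -v cap="$1" -v rate="$2" '
        {
            flow = $1; len = $2
            if (seen[flow] + 0 <= cap + 0) {
                print flow, len, len                  # under the cap: keep as is
            } else if (++over[flow] % rate == 0) {
                print flow, len, len + skipped[flow]  # sampled packet carries skipped bytes
                skipped[flow] = 0
            } else {
                skipped[flow] += len                  # dropped, but its bytes are remembered
            }
            seen[flow] += len
        }
    '
}

# Six 100-byte packets in one flow, cap of 150 bytes, keep 1 in 2 afterwards.
printf 'A 100\nA 100\nA 100\nA 100\nA 100\nA 100\n' | sample_after_cap 150 2
```

In this example four packets survive and their wire lengths add up to the original 600 bytes, so volume metrics are preserved, and the sampled packets still span the whole life of the flow. Bytes skipped after the last sampled packet of a flow are not carried forward in this sketch.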

The --samplerate=N option

With the --samplerate=100 option you can instruct trisul_flowcap to save 1 of every 100 packets for flows that have crossed the threshold volume, and to add the skipped bytes to the WireLength of the sampled packet. The following Wireshark output shows these giant (by wirelength) packets

Dryrun option

We also added a --dryrun option. This runs much faster because there is no writing involved. Use it to estimate the achievable compression. All runs produce a detailed performance report, as shown below.

DOCKER:unplprotectli:root oper$ trisul_flowcap -c /usr/local/etc/trisul-probe/domain0/probe0/context0/trisulProbeConfig.xml  --capbytes=1000000 --samplerate=100 -i RCF_triscap.wseXmJ -o RCF_triscap.wseXmJ.flowcapped
Progress       : ||||||||||||||||||||  100%
In Bytes       : 998931121 (952.65 MB)
Out Bytes      : 313149410 (298.64 MB)
Compression %  : 68.6516 % 
Flows          : 18072
Flows capped   : 85
Flows capped % : 0.470341 % 
Method         : sampled if over flow over 1000000 bytes
Input          : RCF_triscap.wseXmJ
Output in      : dry run no output
Elapsed time   : 3 seconds

DOCKER:unplprotectli:root oper$	

This shows the win: the PCAP file shrinks by more than 68% while only about 0.5% of the flows are affected.
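The two percentages in the report can be reproduced from the counters printed above it. The helper names below are made up for this check; only the four numbers come from the report.

```sh
# Compression % = (1 - out_bytes / in_bytes) * 100
compression_pct() {
    awk -v in_bytes="$1" -v out_bytes="$2" \
        'BEGIN { printf "%.4f\n", (1 - out_bytes / in_bytes) * 100 }'
}

# Flows capped % = capped_flows / total_flows * 100
capped_pct() {
    awk -v capped="$1" -v total="$2" \
        'BEGIN { printf "%.4f\n", capped / total * 100 }'
}

compression_pct 998931121 313149410   # In Bytes, Out Bytes  -> 68.6516
capped_pct 85 18072                   # Flows capped, Flows  -> 0.4703
```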

Full documentation of trisul_flowcap


If you have a running Trisul installation, you can run the following script to prune all PCAPs

  1. older than 14 days => 1MB per flow
  2. with sampling of 1:100 on all

 # Find capture files last modified more than 14 days ago
 OLDPCAPS=$(find /usr/local/var/lib/trisul-probe/domain0/probe0/context0/caps/ -type f -mtime +14)
 for f in $OLDPCAPS
 do
       # Cap each flow at 1000000 bytes and sample 1:100 beyond that
       trisul_flowcap -c /usr/local/share/trisul-probe/domain0/probe0/context0/trisulProbeConfig.xml \
           -i "$f" -o "$f.capped" -n 1000000 -s 100 &&
       mv "$f.capped" "$f"    # replace the original only if pruning succeeded
 done

Why does the tool need a Trisul installation?

Currently trisul_flowcap is installed along with the trisul-probe package. This is required because the tool depends on the Trisul framework to do the protocol parsing to get to the TCP/UDP/IP layer from IPv4/IPv6 and lower link layers. You can also use the tool to convert between PCAP files in tcpdump format.

We hope the network security monitoring community finds this tool and the technique useful!

Download Trisul today. Don’t just monitor, start NSM-ming!