Bulk extractor

bulk_extractor
Original author(s): Simson Garfinkel
Developer(s): Community contributors
Written in: C++
Operating system: Windows; macOS; Linux
Platform: Cross-platform
Type: Digital forensics
License: Free and open-source
Website: github.com/simsong/bulk_extractor

Bulk Extractor (often written as bulk_extractor) is an open-source digital forensics tool that scans disk images, directories, or individual files to extract artefacts such as email addresses, URLs, phone numbers and credit-card numbers without first parsing file-system structures. It is commonly used for triage and for creating machine-readable “feature files” that support downstream analysis.[1][2]

History

The tool originated in academic research on “bulk data analysis” for forensic triage and feature extraction; a peer-reviewed article described its goals and architecture and reported linear speed-ups from multi-threaded processing.[3]

Design and features

Unlike file-centric approaches, bulk_extractor processes the raw byte stream and writes artefacts to per-type “feature files” together with frequency histograms for triage.[1] Independent practitioner guidance notes its use for incident response and memory/disk workflows, including recovery of network traces from RAM images.[2]
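
The byte-stream approach described above can be illustrated with a minimal Python sketch. It is not bulk_extractor's own code (the tool is written in C++), and it makes several assumptions for illustration: the simplified email pattern, the function names scan_bytes, format_feature_line and build_histogram, and the tab-separated line layout of offset, feature and context are all hypothetical choices of this example.

```python
import re
from collections import Counter

# Simplified email pattern for illustration only; a production scanner
# would apply much stricter validation.
EMAIL_RE = re.compile(rb"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+")


def scan_bytes(data, context=8):
    """Yield (offset, feature, context_bytes) for each email-like match.

    The input is treated as an opaque byte buffer: no file-system or
    file-format parsing takes place.
    """
    for match in EMAIL_RE.finditer(data):
        start, end = match.span()
        window = data[max(0, start - context):end + context]
        yield start, match.group().decode("ascii"), window


def format_feature_line(offset, feature, window):
    """Render one record as a tab-separated line (assumed layout)."""
    # Non-printable bytes are shown as \xNN so the line stays plain text.
    shown = "".join(
        chr(b) if 32 <= b < 127 else "\\x%02X" % b for b in window
    )
    return "%d\t%s\t%s" % (offset, feature, shown)


def build_histogram(features):
    """Count case-folded features, most frequent first."""
    counts = Counter(feature.lower() for _, feature, _ in features)
    return counts.most_common()


if __name__ == "__main__":
    # Stand-in for a chunk of a disk image: text mixed with binary bytes.
    sample = (
        b"\x00\x01junk alice@example.com\xff\xfe"
        b"<bob@example.org>\x00ALICE@example.com\x00"
    )
    found = list(scan_bytes(sample))
    for record in found:
        print(format_feature_line(*record))
    for feature, count in build_histogram(found):
        print("n=%d\t%s" % (count, feature))
```

Recording the byte offset with each feature lets an examiner return to the exact location in the image, while the histogram brings the most frequent values to the top for triage.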

A graphical front-end, Bulk Extractor Viewer (BEViewer), is documented in digital-preservation training and community materials oriented to archives and cultural-heritage workflows.[4]

Usage and adoption

Pages published by the U.S. National Institute of Standards and Technology (NIST) describe running bulk_extractor at scale against corpora from the National Software Reference Library (NSRL) and document both the resulting datasets and the limitations encountered in the processing architecture.[5][6] A practitioner-oriented text similarly presents it as a tool for extracting structured artefacts, complementing file carvers such as Foremost or Scalpel.[7] Academic work has also cited bulk_extractor as part of broader forensic pipelines (e.g., peer-to-peer investigations) and bulk-analysis methodologies.[8]

Platforms

bulk_extractor is available for Windows, macOS and Linux and is packaged by third parties (for example, Homebrew on macOS).[9]

References

  1. ^ a b "Extracting Forensic Data from a Device Using Bulk Extractor". SpringerLink. Springer. 2021. Retrieved 8 September 2025.
  2. ^ a b "Extracting pcap from memory". SANS Internet Storm Center. 17 December 2015. Retrieved 8 September 2025.
  3. ^ Garfinkel, Simson L. (2013). "Digital media triage with bulk data analysis and bulk_extractor". Computers & Security. 32: 56–72. doi:10.1016/j.cose.2012.09.011. Retrieved 8 September 2025.
  4. ^ "Bulk Extractor Advanced Topics (slides)". BitCurator Consortium. 2017. Retrieved 8 September 2025.
  5. ^ "bulk_extractor Datasets". NIST. 2019. Retrieved 8 September 2025.
  6. ^ "NSRL bulk_extractor 1.4.4 Data". NIST. 2016. Retrieved 8 September 2025.
  7. ^ "Bulk_extractor — Digital Forensics with Kali Linux". O’Reilly. Retrieved 8 September 2025.
  8. ^ "PeekaTorrent: Leveraging P2P hash values for digital forensics". Digital Investigation. 18: S38–S46. 2016. doi:10.1016/j.diin.2016.04.006. Retrieved 8 September 2025.
  9. ^ "bulk_extractor — Homebrew Formulae". brew.sh. Retrieved 8 September 2025.