Researchers present a foundational study on authorship attribution for Japanese web content, using stylistic features to support threat intelligence. They compared four methods (TF-IDF+LR, BERT embeddings, BERT fine-tuning, and metric learning) on Rakuten Ichiba reviews, finding that BERT fine-tuning achieved the best accuracy but became unstable with hundreds of authors, while TF-IDF with logistic regression (TF-IDF+LR) proved more stable and efficient at scale.
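A minimal sketch of the kind of TF-IDF+LR baseline the summary describes, assuming a scikit-learn pipeline with character n-gram features (a common choice for Japanese text that avoids word segmentation); the toy reviews, author labels, and n-gram range below are illustrative stand-ins, not the paper's Rakuten Ichiba setup or reported configuration.

```python
# Hedged sketch: TF-IDF + logistic regression authorship attribution.
# Data and hyperparameters are hypothetical, not taken from the study.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

# Toy corpus of (review_text, author_id) pairs, repeated to give each
# class enough samples for a stratified split.
reviews = [
    ("配送が早くて助かりました。また利用します。", "author_a"),
    ("梱包が丁寧で商品も説明通りでした。", "author_a"),
    ("値段の割に質が良い。リピート決定です。", "author_b"),
    ("思ったより小さかったが使い勝手は悪くない。", "author_b"),
] * 10

texts, authors = zip(*reviews)
X_train, X_test, y_train, y_test = train_test_split(
    texts, authors, test_size=0.25, stratify=authors, random_state=0
)

# Character n-grams capture stylistic cues without a tokenizer; a linear
# classifier like LR scales to many author classes, which is the stability
# advantage the summary attributes to this baseline.
model = make_pipeline(
    TfidfVectorizer(analyzer="char", ngram_range=(2, 4), min_df=2),
    LogisticRegression(max_iter=1000),
)
model.fit(X_train, y_train)
print("held-out accuracy:", model.score(X_test, y_test))
```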
Research
Foundational Study on Authorship Attribution of Japanese Web Reviews for Actor Analysis
BERT fine-tuning achieves top accuracy for Japanese review authorship attribution but falters at scale (100+ authors), making TF-IDF+LR the practical choice for large-scale threat actor analysis.
Tuesday, April 21, 2026 12:00 PM UTC | 2 MIN READ | SOURCE: arXiv CS.CL (Computation & Language) | BY sys://pipeline
Tags
research