The oddly high impact of 'security' conferences

10 Mar 2010

While researching something unrelated, I stumbled across an interesting feature of CiteSeerX: "estimated venue impact factors." That is, it attempts to rank CS-related conferences and journals in terms of their 'impact.' However, something seems to be wrong with their algorithm: there is no way that a single sub-specialty (security) can contain eight of the top 25 conferences.

For those of you with no idea what I'm talking about: There are a number of sites that attempt to catalog and organize the vast, inter-connected web of academic papers in Computer Science. For example, DBLP, CiteSeer and CiteSeerX all allow you to search for papers by venue, papers by author, find papers that cite a particular paper, etc. etc. etc. (I'm sure that there are similar sites for other specialties, but I'm not familiar with them.)

Once you've got all this information, though, it becomes tempting to do something interesting with it. DBLP, for example, will list aggregate information about authors such as their co-authors, the venues they publish at, and how much they published each year. Their page on me, for example, reveals some decidedly uncomfortable truths about my recent academic output.*

But before I dwell too much on that depressing thought, let me quickly return to the venue-impact estimates I mentioned earlier. CiteSeerX attempts to estimate 'Garfield's traditional impact factor' for each conference, and thus produces a ranked list. At the time of this writing, the list contains 581 conferences and the top 25 are as follows:

  1. POPL 0.45
  2. OSDI 0.43
  3. PLDI 0.4
  4. ACM Conference on Computer and Communications Security 0.39
  5. S&P 0.37
  6. NSDI 0.37
  7. CSFW 0.33
  8. ASPLOS 0.32
  9. SIGCOMM 0.31
  10. RAID 0.31
  11. EuroSys 0.3
  12. FAST 0.3
  13. TCC 0.26
  14. IPTPS 0.26
  15. CGO 0.25
  16. CRYPTO 0.25
  17. VMCAI 0.25
  18. TACAS 0.25
  19. SAS 0.23
  20. CAV 0.22
  21. ESOP 0.22
  22. LCTES 0.2
  23. USENIX Annual Technical Conference, General Track 0.19
  24. EUROCRYPT 0.17
  25. Public Key Cryptography 0.17

Eight of these are security related: the ACM Conference on Computer and Communications Security, S&P, CSFW, RAID, TCC, CRYPTO, EUROCRYPT and Public Key Cryptography. To which I say: really? A full third? There's obviously something wrong here, and a quick glance at Wikipedia reveals what:

In a given year, the impact factor of a journal is the average number of citations to those papers that were published during the two preceding years. For example, the 2008 impact factor of a journal would be calculated as follows:

  • A = the number of times articles published in 2006 and 2007 were cited by indexed journals during 2008
  • B = the total number of "citable items" published in 2006 and 2007. ("Citable items" are usually articles, reviews, proceedings, or notes; not editorials or Letters-to-the-Editor.)
  • 2008 impact factor = A/B
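The definition above boils down to a single division. Here is a minimal sketch in Python; the function name and the sample counts are made up for illustration and are not CiteSeerX's code or data:

```python
def impact_factor(citations_this_year: int, citable_items: int) -> float:
    """Two-year impact factor, A / B.

    citations_this_year: A, citations received this year by items
        published in the two preceding years.
    citable_items: B, number of citable items published in those two years.
    """
    if citable_items <= 0:
        raise ValueError("need at least one citable item")
    return citations_this_year / citable_items


# Hypothetical venue: 120 papers over 2006 and 2007, cited 54 times during 2008.
print(impact_factor(54, 120))
```

Note that nothing in the formula knows who is doing the citing, which is what matters for the rest of this post.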

Ah, that explains things. It is not that security necessarily has a particularly high impact, but that we cite ourselves more frequently than other sub-disciplines. There could be a number of reasons for this, but I suspect that this is mostly just a cultural thing. And if this is the case, it is probably a mistake to use the same impact-estimate statistic to compare conferences across different sub-specialties. That is, this list might be useful to compare CSF to CRYPTO, for example**, but not CSF to POPL.
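To make the "cultural thing" concrete, here is a toy model in Python. Everything in it is an assumption for the sake of illustration: the numbers are invented, each venue publishes the same number of papers every year, and its papers are cited only by this year's papers at the same venue.

```python
def venue_impact_factor(papers_per_year: int,
                        recent_same_venue_cites_per_paper: float) -> float:
    """Impact factor of a hypothetical closed-community venue.

    papers_per_year: papers the venue publishes each year (assumed constant).
    recent_same_venue_cites_per_paper: average number of references each of
        this year's papers makes to the venue's papers from the two
        preceding years.
    """
    # A: citations made this year to the venue's last two years of papers.
    a = papers_per_year * recent_same_venue_cites_per_paper
    # B: papers the venue published in those two years.
    b = 2 * papers_per_year
    return a / b


# Two invented venues of identical size that differ only in citing habits.
chatty = venue_impact_factor(50, 0.8)
terse = venue_impact_factor(50, 0.4)
print(chatty, terse)
```

In this model the venue size cancels out, so the score is just half the citing habit: the venue whose authors cite recent in-community work twice as often gets twice the impact factor, with no difference in how good or influential the papers are.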

* But I hasten to note that that's the page for "Jonathan C. Herzog." Most annoyingly, they seem to regard "Jonathan C. Herzog" and just "Jonathan Herzog" as different people.

** Well, not really. But the comparison between EUROCRYPT and Financial Cryptography is probably fair.



Actually Citeseer is sufficiently uncommon as to be worthy of frequent mention. I think most fields use something like the ISI Web of Science citation index (indexes? there might be 3) if they want to investigate impact.



(I've actually been poking around a bit with impact factor and related ideas lately. One of the frequent arguments for open access journals (i.e. with freely available content) is that it increases the impact factor, and physics and CS are often looked to for supporting evidence. But I keep wondering if this is a nonscalable thing -- that open access increases impact factor in a world where most papers are not OA, because the ones that are are disproportionately available hence cited. Or that drawing too much evidence from a handful of disciplines -- which are, admittedly, the ones with the most established history of OA and the best tools for statistics generation -- may rest too much on particular cultural features of those fields. Maybe, as you say, y'all just *like* to cite each other. (Maybe because you're so conference-driven -- papers you encounter, or authors you encounter, in that social context have more of a grip on your thinking than papers you encounter in a database might, and as such you're apter to cite them, either because you've been thinking about their content or because you feel some emotional pressure to do so?))


note to self: cite myself and friends more often

I think this list shows lots of things that are wrong about formulas for relative impact based on citation counts.

a) subfield specific, as you point out.

b) Even within subfields, weird things happen. TCC, for example, shows up on the list well above both Crypto and Eurocrypt, even though a poll of crypto researchers would almost certainly show that Crypto/Eurocrypt are thought of as having higher impact, in the sense of being more widely read and more influential on the field's general direction.

c) Why does TCC show up but not, say, SODA, STOC or FOCS? My *guess* is that SODA/STOC/FOCS are broader and benefit less from the network effect of having lots of self-citation within a tightly connected community.

[That said, a quick check on Google Scholar indicates that my two most cited papers appeared at Eurocrypt and TCC.]



According to a more comprehensive bibliographical analysis tool, only 3 out of the top 25 confs are security-related.

http://academic.research.microsoft.com/CSDirectory/Conf_category_24.htm