# A New Explanation of Some Leiden Ranking Graphs Using Exponential Functions

• Egghe, Leo (Campus Diepenbeek Universiteit Hasselt)
• Accepted : 2013.08.06
• Published : 2013.09.30

#### Abstract

A new explanation, using exponential functions, is given for the S-shaped functional relation between the mean citation score and the proportion of top 10% (and other percentages) publications for the 500 Leiden Ranking universities. With this new model we again obtain an explanation for the concave or convex relation between the proportions of top $100\theta\%$ publications for different fractions $\theta$.

#### Keywords

Leiden ranking; exponential function; mean citation rate; S-shaped

# 1. INTRODUCTION

For 500 universities (from 41 countries) in the Leiden Ranking 2011/2012, Waltman et al. (2012) report the relation between the mean normalized citation score (MNCS) and the proportion of top 10% publications (PP$_{top10\%}$).

Upon specifying a field, MNCS is the mean number of citations of the publications of a university in this field (normalized in several ways – see Waltman et al.). PP$_{top10\%}$ is the proportion (fraction) of the publications of a university in this field that, compared with other publications in this field, belong to the top 10% most frequently cited.

In Waltman et al. one finds an S-shaped relation between PP$_{top10\%}$ and MNCS: first convex, then concave (see their Fig. 2). For other percentages, Waltman et al. find a convex relation between PP$_{top10\%}$ (as abscissa) and PP$_{top5\%}$ (as ordinate) and a concave relation between PP$_{top10\%}$ (as abscissa) and PP$_{top20\%}$ (as ordinate); see their Fig. 3.

In Egghe (2013) we explained all these regularities using the shifted Lotka function

$f(n)={C \over (n+1)^{\alpha}}$ (1)

where 𝐶>0, 𝛼>1, 𝑛≥0, which was documented in Egghe and Rousseau (2012). Here 𝑓(𝑛) is the continuous version of the number of publications with 𝑛 citations. Using (1) we studied the functional relation between the non-normalized variant of MNCS, denoted MCS, and PP$_{top10\%}$. Putting 𝑥 = 𝑀𝐶𝑆 and 𝑦 = PP$_{top10\%}$ we proved in Egghe (2013) that

$y=10^{-{{1\over x}+1 \over \alpha-1}}$ (2)

where 𝛼 is the Lotka exponent in (1). We also proved that (2) is S-shaped, thereby explaining this empirical relationship in Waltman et al. (2012). For general fractions 𝜃 we obtained in Egghe (2013)

$y=\theta^{{{1\over x}+1 \over \alpha-1}}$ (3)

where 𝑥 = 𝑀𝐶𝑆 and 𝑦 = 𝑃𝑃(𝜃) ≕ 𝑃𝑃$_{top100\theta\%}$.

From this model we proved, for any fractions 𝜃$_1$, 𝜃$_2$, the following functional relation between 𝑃𝑃 (𝜃$_1$) and 𝑃𝑃 (𝜃$_2$) :

$PP(\theta_2)=PP(\theta_1)^{{\ln\theta_2} \over {\ln\theta_1}}$ (4)

which is an explanation of the convex and concave graphs in Waltman et al.: concave if 𝜃$_2$ > 𝜃$_1$ and convex if 𝜃$_2$ < 𝜃$_1$.

In this paper the same problems are studied: explaining the S-shaped relationship between MCS and 𝑃𝑃 (𝜃) for any 𝜃 and the convex or concave relationships between two 𝑃𝑃 (𝜃$_1$) and 𝑃𝑃 (𝜃$_2$) as found in Waltman et al. Now, however, we do not use the shifted Lotka function (1) but the exponential function (a very classical function)

$f(n)=Ca^{-n}$ (5)

where 𝐶>0, 𝑎>1, 𝑛≥0, and the function 𝑓(𝑛) has the same meaning as explained above. With 𝑥 = 𝑀𝐶𝑆 and 𝑦 = 𝑃𝑃(𝜃) we will prove (in the next section) that

$y=\theta^{{1 \over x \ln a}}$ (6)

which is clearly a different function from (3). This regularity nevertheless also explains the one found empirically in Waltman et al., since (6) is also S-shaped.

Remarkably, in the third section, using (6) for two fractions 𝜃$_1$ and 𝜃$_2$, we will reprove (4); i.e., the same regularity between any two 𝑃𝑃 (𝜃$_1$) and 𝑃𝑃 (𝜃$_2$) is found using exponential functions as when we used the shifted Lotka function (which are clearly different functions). But, at the end of the paper, we will also give two cases where (4) is not valid.

The paper closes with a section on conclusions and open problems.

# 2. EXPLANATION OF THE RELATION BETWEEN MCS AND PP(𝜃)

As indicated in the introduction we use the exponential function

$f(n)=Ca^{-n}$     (7)

denoting the continuous version of the number of publications with 𝑛 citations in a field, where 𝐶>0, 𝑎>1, 𝑛≥0. Since the field is fixed, 𝐶 and 𝑎 are also fixed.

For a university we use the exponential function

$\varphi(n)=C'{a'}^{-n}$ (8)

(𝐶′>0, 𝑎′>1, 𝑛≥0) denoting the continuous version of the number of publications of this university with 𝑛 citations. Since we deal with several universities (e.g. 500 in the case of Waltman et al. (2012)), 𝐶′ and 𝑎′ are variables here.

As noted by one referee, using (8) per university, with 𝐶′ and 𝑎′ variable, does not necessarily lead to (7) for the entire field. Therefore, formula (7) should be regarded as an assumption that holds in practice.

Denote by 𝑇 the total number of publications in the entire field and by 𝑇′ the total number of publications of a university in this field. We have, by definition of 𝑓(𝑛)

$T=\int_{0}^{\infty}f(n)\,dn={C \over \ln a}$ (9)

(since 𝑎>1) and similarly

$T'={C' \over \ln a'}$ (10)

Denote by 𝐴 the total number of citations in the entire field and by 𝐴′ the total number of citations of a university in this field. We have, by definition of 𝑓(𝑛)

$A=\int_{0}^{\infty}n\,f(n)\,dn={C \over (\ln a)^2}$ (11)

which is easily seen using integration by parts, the fact that 𝑎>1, and the fact that

$\lim_{n \to \infty}{n \over a^n}=0$ (12)

Similarly we have

$A'={C' \over (\ln a')^2}$ (13)

By definition of MCS, being the average number of citations per publication of a university, we have

$MCS={A' \over T'}={1 \over \ln a'}$ (14)

using (10) and (13).
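The closed forms $T'=C'/\ln a'$, $A'=C'/(\ln a')^2$ and $MCS=1/\ln a'$ can be checked numerically. The following is a minimal sketch, not part of the original derivation: the values of `C_PRIME` and `A_PRIME`, the helper `integrate` and the truncation point `UPPER` are illustrative assumptions.

```python
import math


def integrate(g, lower, upper, steps=200_000):
    """Composite midpoint rule for g on [lower, upper]."""
    h = (upper - lower) / steps
    return h * sum(g(lower + (i + 0.5) * h) for i in range(steps))


# Hypothetical university parameters, chosen only for illustration.
C_PRIME = 50.0
A_PRIME = 1.6  # base a' > 1


def phi(n):
    """Exponential density (8): phi(n) = C' * a'^(-n)."""
    return C_PRIME * A_PRIME ** (-n)


# The integrand is negligible beyond this point for a' = 1.6,
# so a finite upper limit stands in for infinity.
UPPER = 200.0

T_numeric = integrate(phi, 0.0, UPPER)
A_numeric = integrate(lambda n: n * phi(n), 0.0, UPPER)

T_closed = C_PRIME / math.log(A_PRIME)        # closed form for T'
A_closed = C_PRIME / math.log(A_PRIME) ** 2   # closed form for A'
mcs_closed = 1.0 / math.log(A_PRIME)          # closed form for MCS

print("T'  numeric vs closed:", T_numeric, T_closed)
print("A'  numeric vs closed:", A_numeric, A_closed)
print("MCS numeric vs closed:", A_numeric / T_numeric, mcs_closed)
```

Note that $C'$ cancels in the ratio, so under this model MCS depends on the university only through $a'$.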

We first determine $n_0$, defining the top 100𝜃% publications in the field (for any fraction 𝜃), by (7):

$\int_{n_0}^{\infty}Ca^{-n}\,dn=\theta T$ (15)

From (15) it follows that

${C \over \ln a}a^{-n_0}=\theta T$

and by (9) we have

$a^{-n_0}=\theta$

or

$n_0=-{\ln\theta \over \ln a}$ (16)

which is a positive number because 0<𝜃<1.
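As an illustration of (16), the sketch below computes the threshold $n_0$ for a few fractions 𝜃 and confirms that the share of field publications above $n_0$ is again 𝜃. The function names are hypothetical, and the field value ln 𝑎 = 0.8 is borrowed from Fig. 1 purely as an example.

```python
import math


def top_threshold(theta, a):
    """Citation threshold n0 from (16): n0 = -ln(theta) / ln(a)."""
    if not 0.0 < theta < 1.0:
        raise ValueError("theta must lie strictly between 0 and 1")
    if a <= 1.0:
        raise ValueError("a must exceed 1")
    return -math.log(theta) / math.log(a)


def share_above(n0, a):
    """Share of field publications with at least n0 citations: a^(-n0)."""
    return a ** (-n0)


a_field = math.exp(0.8)  # illustrative field parameter, ln a = 0.8

for theta in (0.05, 0.10, 0.20):
    n0 = top_threshold(theta, a_field)
    print(f"theta={theta:.2f}  n0={n0:.3f}  share above n0={share_above(n0, a_field):.4f}")
```

A smaller 𝜃 gives a larger threshold, as expected: the top 5% begins at a higher citation count than the top 20%.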

Then the proportion of the university's publications belonging to these top 100𝜃% of the papers in the field is, by (8)

$PP(\theta)={1 \over T'} \int_{n_0}^{\infty}C'{a'}^{-n}\,dn$

$PP(\theta)={a'}^{-n_0}$

$PP(\theta)=\theta^{{1 \over MCS}{1 \over \ln a}}$ (18)

by (16) and (14), since ${a'}^{-n_0}={a'}^{\ln\theta/\ln a}=\theta^{\ln a'/\ln a}$. This is the function (6). It is an increasing function of 𝑥 = 𝑀𝐶𝑆 (since 0<𝜃<1) for which

$\lim_{x \to \infty}PP(\theta)=1$ (19)

and

$\lim_{x \to 0}PP(\theta)=0$ (20)

The number 𝑎 is a fixed parameter (of the field). Fig. 1 is the graph of (18) for ln 𝑎 = 0.8, from which the S-shape is clear; it is close to the S-shape obtained in Waltman et al. (2012). So this represents a new explanation of this regularity.

Fig. 1 Graph of (18) for ln a = 0.8
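The S-shape can also be located analytically. Writing $k=-\ln\theta/\ln a>0$, the function (18) becomes $y=e^{-k/x}$, and differentiating twice gives $y''=e^{-k/x}\,k(k-2x)/x^{4}$. This suggests that the curve is convex for $x<k/2$ and concave for $x>k/2$. This inflection point is a side calculation and is not stated in the source. The sketch below checks it by finite differences; the choice 𝜃 = 0.10 and the helper names are assumptions made for illustration.

```python
import math


def pp(theta, mcs, ln_a):
    """Equation (18): PP(theta) = theta^(1 / (MCS * ln a))."""
    return theta ** (1.0 / (mcs * ln_a))


def second_difference(g, x, h=1e-3):
    """Central finite-difference estimate of g''(x)."""
    return (g(x + h) - 2.0 * g(x) + g(x - h)) / (h * h)


THETA = 0.10  # illustrative fraction (top 10%)
LN_A = 0.8    # field value used for Fig. 1

# Candidate inflection point x* = k / 2 with k = -ln(theta) / ln(a).
x_star = -math.log(THETA) / (2.0 * LN_A)


def curve(x):
    """PP(theta) as a function of x = MCS for the fixed THETA and LN_A."""
    return pp(THETA, x, LN_A)


print("inflection at MCS =", round(x_star, 3))
print("curvature left of x* :", second_difference(curve, 0.5 * x_star))
print("curvature right of x*:", second_difference(curve, 2.0 * x_star))
```

A positive curvature to the left of the candidate point and a negative one to the right is what the S-shape in Fig. 1 requires.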

# 3. EXPLANATION OF THE RELATION BETWEEN ANY TWO VALUES OF PP(𝜃$_1$) AND PP(𝜃$_2$)

For any two fractions 𝜃$_1$ and 𝜃$_2$ we have, by (18)

$PP(\theta_1)=\theta_1^{{1 \over MCS}{1 \over \ln a}}$ (21)

$PP(\theta_2)=\theta_2^{{1 \over MCS}{1 \over \ln a}}$ (22)

Hence

$\ln PP(\theta_1)={{1 \over MCS}{1 \over \ln a}} \ln\theta_1$

$\ln PP(\theta_2)={{1 \over MCS}{1 \over \ln a}} \ln\theta_2$

from which it follows that for any two fractions 𝜃$_1$ and 𝜃$_2$

${\ln PP(\theta_1) \over \ln\theta_1} = {\ln PP(\theta_2) \over \ln\theta_2}$ (23)

hence

$e^{\ln PP(\theta_1) \over \ln\theta_1} = e^{\ln PP(\theta_2) \over \ln\theta_2}$

or

$PP(\theta_1)^{1 \over \ln\theta_1}=PP(\theta_2)^{1 \over \ln\theta_2}$

$PP(\theta_2)=PP(\theta_1)^{\ln\theta_2 \over \ln\theta_1}$ (24)

which is (4).

As already remarked in the introduction, this relation is the same as the one found in Egghe (2013), where the shifted Lotka function was used: a remarkable fact!

We have derived that (since 0 < 𝜃$_1$, 𝜃$_2$ < 1), if 𝜃$_2$ > 𝜃$_1$, then (by (24)) 𝑃𝑃(𝜃$_2$) is a concave function of 𝑃𝑃(𝜃$_1$), and that, if 𝜃$_2$ < 𝜃$_1$, then (by (24)) 𝑃𝑃(𝜃$_2$) is a convex function of 𝑃𝑃(𝜃$_1$); see the graphs in Egghe (2013), which explain the corresponding graphs in Waltman et al. (2012).
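The concavity and convexity follow from the exponent in (24): a power function $u^{c}$ on $0<u<1$ is concave for $0<c<1$ and convex for $c>1$, and $\ln\theta_2/\ln\theta_1$ is below 1 exactly when 𝜃$_2$ > 𝜃$_1$. The following sketch, with illustrative parameter values and hypothetical function names, checks that (24) reproduces 𝑃𝑃(𝜃$_2$) from 𝑃𝑃(𝜃$_1$) whatever the value of MCS.

```python
import math


def pp(theta, mcs, ln_a):
    """Equation (18): PP(theta) = theta^(1 / (MCS * ln a))."""
    return theta ** (1.0 / (mcs * ln_a))


def pp2_from_pp1(pp1, theta1, theta2):
    """Equation (24): PP(theta2) = PP(theta1)^(ln theta2 / ln theta1)."""
    return pp1 ** (math.log(theta2) / math.log(theta1))


LN_A = 0.8  # illustrative field parameter

# (24) should hold for every university, i.e. for every MCS.
for mcs in (0.5, 1.0, 2.0, 4.0):
    p10 = pp(0.10, mcs, LN_A)
    p20_direct = pp(0.20, mcs, LN_A)
    p20_via_24 = pp2_from_pp1(p10, 0.10, 0.20)
    print(f"MCS={mcs}: direct={p20_direct:.6f}  via (24)={p20_via_24:.6f}")

# Exponents for the two cases considered by Waltman et al.
exponent_top20 = math.log(0.20) / math.log(0.10)  # below 1: concave
exponent_top05 = math.log(0.05) / math.log(0.10)  # above 1: convex
print("exponent top 20% vs top 10%:", round(exponent_top20, 3))
print("exponent top 5% vs top 10% :", round(exponent_top05, 3))
```

Neither 𝑎 nor MCS appears in the exponent, which is why all universities in a field fall on a single curve.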

So (24) is valid when 𝑓 and 𝜑 are both shifted Lotka functions (proved in Egghe (2013)) and when 𝑓 and 𝜑 are both exponential functions (proved here). Now we present two cases where (24) is not valid.

## Case I

We take 𝑓 to be a shifted Lotka function and 𝜑 to be an exponential function:

$f(n)={C \over (n+1)^{\alpha}}$ (25)

(C>0, $\alpha$>2, n$\ge$0)

$\varphi(n)=C'{a'}^{-n}$ (26)

(C'>0, $a'$>1, n$\ge$0)

Following the method of the previous section we find, for every fraction 𝜃

$PP(\theta)=e^{{1 \over MCS}\left(1-\left({1 \over \theta}\right)^{1 \over \alpha-1}\right)}$ (27)

From the method in this section we find, for any two fractions 𝜃$_1$ and 𝜃$_2$,

$PP(\theta_2)=PP(\theta_1)^{{1-\left({1 \over \theta_2}\right)^{1 \over \alpha-1}} \over {1-\left({1 \over \theta_1}\right)^{1 \over \alpha-1}}}$ (28)

which is, clearly, not the function (24).
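A numerical illustration of Case I, as a sketch only: the Lotka exponent `ALPHA = 3.0` and the fractions are hypothetical choices, and the function names are ours. The code evaluates (27), confirms that (28) links the two proportions, and compares the exponent of (28) with that of (24).

```python
import math


def pp_case1(theta, mcs, alpha):
    """Equation (27): field is shifted Lotka, university is exponential."""
    return math.exp((1.0 - (1.0 / theta) ** (1.0 / (alpha - 1.0))) / mcs)


def exponent_28(theta1, theta2, alpha):
    """Exponent appearing in (28)."""
    num = 1.0 - (1.0 / theta2) ** (1.0 / (alpha - 1.0))
    den = 1.0 - (1.0 / theta1) ** (1.0 / (alpha - 1.0))
    return num / den


ALPHA = 3.0               # hypothetical Lotka exponent of the field (> 2)
THETA1, THETA2 = 0.10, 0.20

exp_28 = exponent_28(THETA1, THETA2, ALPHA)
exp_24 = math.log(THETA2) / math.log(THETA1)

for mcs in (0.5, 1.0, 2.0):
    p1 = pp_case1(THETA1, mcs, ALPHA)
    p2 = pp_case1(THETA2, mcs, ALPHA)
    print(f"MCS={mcs}: PP2={p2:.6f}  PP1^exp28={p1 ** exp_28:.6f}  PP1^exp24={p1 ** exp_24:.6f}")

print("exponent in (28):", round(exp_28, 4), " exponent in (24):", round(exp_24, 4))
```

For these values the two exponents differ, so (24) does not reproduce 𝑃𝑃(𝜃$_2$) here, while (28) does.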

## Case II

We take 𝑓 to be an exponential function and 𝜑 to be a shifted Lotka function:

$f(n)=Ca^{-n}$        (29)

(C>0, $a$>1, n$\ge$0)

$\varphi(n)={C' \over (n+1)^{\alpha'}}$ (30)

(C'>0, $\alpha$'>2, n$\ge$0)

Following the methods of the previous section we find, for every fraction 𝜃

$PP(\theta)=\left(1-{\ln\theta \over \ln a}\right)^{-1-{1 \over MCS}}$ (31)

From the method in this section we find, for any two fractions 𝜃$_1$ and 𝜃$_2$,

$PP(\theta_2)=PP(\theta_1)^{{\ln\left(1-{\ln\theta_2 \over \ln a}\right)} \over {\ln\left(1-{\ln\theta_1 \over \ln a}\right)}}$ (32)

which is, clearly, not the function (24).
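Case II can be illustrated in the same way. In (31) the university parameter enters only through MCS; repeating the moment calculation of Section 2 for a shifted Lotka function appears to give $MCS=1/(\alpha'-2)$, a step not spelled out in the source. The sketch below uses the illustrative value ln 𝑎 = 0.8 and hypothetical function names.

```python
import math


def pp_case2(theta, mcs, ln_a):
    """Equation (31): field is exponential, university is shifted Lotka."""
    return (1.0 - math.log(theta) / ln_a) ** (-1.0 - 1.0 / mcs)


def exponent_32(theta1, theta2, ln_a):
    """Exponent appearing in (32)."""
    num = math.log(1.0 - math.log(theta2) / ln_a)
    den = math.log(1.0 - math.log(theta1) / ln_a)
    return num / den


LN_A = 0.8                # illustrative field parameter
THETA1, THETA2 = 0.10, 0.20

exp_32 = exponent_32(THETA1, THETA2, LN_A)
exp_24 = math.log(THETA2) / math.log(THETA1)

for mcs in (0.5, 1.0, 2.0):
    p1 = pp_case2(THETA1, mcs, LN_A)
    p2 = pp_case2(THETA2, mcs, LN_A)
    print(f"MCS={mcs}: PP2={p2:.6f}  PP1^exp32={p1 ** exp_32:.6f}  PP1^exp24={p1 ** exp_24:.6f}")

print("exponent in (32):", round(exp_32, 4), " exponent in (24):", round(exp_24, 4))
```

Unlike in (24), the exponent in (32) depends on the field parameter 𝑎.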

# 4. CONCLUSIONS AND SUGGESTIONS FOR FURTHER RESEARCH

Empirical regularities reported in Waltman et al. (2012) are explained mathematically in this paper. Using the exponential function we proved the S-shaped functional relation between the mean citation rate and the proportion of top 100𝜃% publications. We obtained a different function than in Egghe (2013), where a shifted Lotka function was used, but we obtained an S-shape in both cases.

With this new model we could reprove the function (obtained in Egghe (2013))

$PP(\theta_2)=PP(\theta_1)^{\ln\theta_2 \over \ln\theta_1}$ (33)

for the relation between two 𝑃𝑃 (𝜃)-values. It is very remarkable that we obtain exactly the same function as in Egghe (2013) although different starting functions were used (shifted Lotka in Egghe (2013) and exponential here). We also showed that (33) explains the corresponding empirical regularities in Waltman et al. (2012).

The importance of this paper is that the assumption of a simple exponential function (5) leads to an explanation of several regularities in Waltman et al.

We state as an open problem: can the S-shape in Waltman et al. for the relation between the mean citation rate and the proportion of top 100𝜃% publications be proved using other starting functions (other than the shifted Lotka function and other than the exponential function)?

Also the following is an open problem: characterise the functions 𝑓(𝑛) and 𝜑(𝑛) (previous section) for which (33) is valid. From Egghe (2013) and this paper, this class of functions must include the shifted Lotka function and the exponential function.

#### References

1. Egghe, L. (2013). Informetric explanation of some Leiden Ranking graphs. Journal of the American Society for Information Science and Technology, to appear.
2. Egghe, L. & Rousseau, R. (2012). Theory and practice of the shifted Lotka function. Scientometrics, 91(1), 295-301. https://doi.org/10.1007/s11192-011-0539-y
3. Waltman, L., Calero-Medina, C., Kosten, J., Noyons, E. C.M., Tijssen, R. J.W., van Eck, N.J., van Leeuwen, T.N., van Raan, A. F.J., Visser, M.S., & Wouters, P. (2012). The Leiden Ranking 2011/2012: Data collection, indicators, and interpretation. Journal of the American Society for Information Science and Technology, 63(12), 2419-2432. https://doi.org/10.1002/asi.22708