Bo Sundgren – An intellectual biography – No. 8 The last decade

In the early 1970s, Bo wanted his own general theory of data bases, and dismissed the relational model as too limited. History has not been kind to that judgement.

Then there was a major ARKSY project at Statistics Sweden. At the end of the 1970s there was even a first major UN session about statistical systems.

What became of those efforts we do not know, except that they must have failed. But we do know that most of the ideas and concepts were already in circulation back then.

During the 1980s, everything went very quiet, at least judging from the publications on Bo’s own web site.

In the early 1990s the old issues from the 1970s reappeared. Bo then started to rehash his old material and publish it in R & D Reports from Statistics Sweden.

In all this, we do not find a single article in a leading academic journal about informatics. Why not?

In any case, we have now reached the final decade. The 2000s. The beginning of this decade saw a concerted effort from the EU to finance “statistical metadata” projects. This is when we find projects such as AMRADS, Metanet and COSMOS. What did Bo contribute via these projects, if anything?

Not much, it seems. In fact, these projects were overall a great failure. And here is the most important point! These failures were totally unnecessary. If Bo had been honest about learning from his earlier failures, they might never have been repeated in the 2000s. Instead, he did the very opposite and sold his old horse to the new international metadata circuit.

Then, suddenly, there were SDMX and DDI and the METIS framework. The SDMX consultant did not even know about Sundgren’s earlier models, so they arranged a special meeting between Bo and the consultant. Very little came out of that, except the new term DSD instead of key family (and Bo did not come up with that term himself!).

This has not stopped the international statistical metadata circuit from honoring Sundgren. To the contrary, this was exactly what they had been looking for. A mentor that could teach them how to do nothing slowly, forever.

For example, since 2001 the International Marketing and Statistical Output Database Conference (IMAODBC) has awarded a yearly “Bo Sundgren Award”, and the prize winners are of course only from statistical agencies. There is no self-denial here.

On Sundgren’s home page we find a large number of publications from the 2000s. What can we say about them?

Of course, he has continued to rehash old material. He has also begun to float more and more on top of things. He has left the more concrete issues and discusses statistical systems and public information systems in an increasingly lofty and abstract manner.

What else is there to do, when you have not been able to solve – or even properly formulate – the fundamental issues to be solved?

This has not stopped Bo from parading as knowledgeable about modelling. For example, from 2005, we find a publication with the title “Modelling statistical systems”. Go, figure!

Another typical publication in this genre, is one with the title “Classification of statistical metadata”. Who needs that?

Where are the solutions to real-life modelling problems, stupid!

I think you get the general drift of this. You can always visit the “Publications” page on his home page and see the list of publications from the 2000s yourself!

I will end this intellectual biography with a small contest. If you can find a publication by Bo that has contributed to the solving of a core metadata modelling issue, feel free to contact me!

Bo Sundgren – An intellectual biography – No. 7 Another missing link!

My Swedish source continues to turn up new publications by Professor Sundgren.

It would seem that his bibliography, as published on his home page, is incomplete.

Here is what he has found:

“RAM – A Framework for a Statistical Production System”, published by Statistics Sweden in 1979.

Another missing link?

Now, here is something interesting. Guess where it was first published!

By the Conference of European Statisticians, at their seminar on “Integrated Statistical Information Systems”.

Wow!

However, a quick look at this report suggests that Bo was busy trying to solve data base management issues with his own infological model, issues that were later solved by commercial relational databases. Maybe that is why this ended up on the historical scrapheap?

On the other hand, the idea of general production modules, for aggregation, etc., can also be found here.

If anything, this just strengthens my critique. If all visions, goals and ideas, even concepts, were on the table already thirty years ago, why has this still not been solved?

The further back in time we can follow this, the more incriminating it gets. In this case, we have revealed an early involvement from the UN.

Bo Sundgren – An intellectual biography – No. 6 The missing link?

My posts about professor Sundgren have been relatively long. This one will be short.

I have asked what happened with Bo’s general theory of databases after his 1974 doctoral thesis.

I think I have found part of the answer! Check out the list of references in the following paper:

Sundgren, B. (2001a). The AlfaBetaGammaTau-model: A theory of multidimensional structures of statistics. MetaNet, Voorburg, the Netherlands.

In 2000, Microsoft had already committed itself in earnest to analytical software. Yet, the only references in this paper are to Sundgren himself. As if no one else in the whole world had written anything worth mentioning about dimensional modelling – except Sundgren himself.

However, the main point with the list of references in this paper is something else. Check out the second publication on the list!

Sundgren, Bo (1975): “Theory of Data Bases”, Mason/Charter Publishers, New York 1975, ISBN 0-88405-307-5.

Wow, his proposed theory of data bases apparently became a book, the year after his doctoral thesis. And it was published in the U.S.

I wonder how that book was received? And whether it contained any references (to others)? And whether parts of its contents were published in real academic journals?

If we keep digging we can find more interesting titbits. My Swedish source has just found another. In a report from Statistics Sweden from 1992, Bo does use references, and bitches about having been early himself!

“The Entity-Relationship model is often ascribed to Chen (1976). However, it is a fact that similar conceptual models, originated by several European authors, had been presented and used for at least a decade before the appearance of Chen’s paper; for example, see Langefors (1966), Sundgren (1973, 1974), Durchholz and Richter (1974), Lindgreen (1974).” (see p. 4, “Some properties of statistical information”, 1992:16).

So, Chen published a paper? And Bo, he published – yes, what? A book?

(What we do find at the end of this report is a reference to a contribution by Sundgren to an edited volume from 1974, about data bases. The topic in Sundgren’s contribution was the “Conceptual Foundation of the Infological Approach”).

Bo Sundgren – An intellectual biography – No. 5 A lost decade 1991-1999

So, all the cards were on the table already in 1974: both the ideas about standardizing the production of statistics and the realisation that models, standards, systems, etc., would be required. In the ARKSY project, “reuse” was the key concept.

Then what happened?

In 1991 Bo Sundgren and Bengt Rosén published a model for documentation for reuse of micro data (SCBDOK). Go, figure!

Should that not have been solved a long time ago, as a natural part of the ARKSY project? Sixteen years later there is suddenly something in writing about how to document micro data for reuse! What happened during all those years?

Well, let bygones be bygones. Better late than never. Right? Not really. Did they solve the problem? No, they did not. SCBDOK is a model in the abstract that describes what a survey is. It avoids all the tough issues of exactly what, how and how much information is needed to really reuse data. In fact, the concept of reuse is not even explored on a practical level. It is therefore fair to claim that SCBDOK is not really about documentation for reuse. It is much more like the GSBPM than an analysis of what reuse is and what documentation it requires in practice.

What we see is an example of a structural flaw in the way that Bo approaches a problem. He abstracts and theorizes but does not care about implementation and realities on the ground. Much like his OPR(t) model. In 1974 he believed that a general theory of databases was more interesting than Codd’s relational model, a model that he described as limited in its practical use.

We also have to repeat our question. Why did Rosén and Sundgren not publish their SCBDOK model in an academic journal?

Well, in the decade to come Bo would produce three more classics:

Sundgren, B. (1993). Guidelines on the Design and Implementation of Statistical Metainformation Systems. Report for the UN/ECE METIS Group, established within the programme of work of the Conference of European Statisticians.

Sundgren, B. (1995). Guidelines for the Modelling of Statistical Data and Metadata. Published as Guidelines from the United Nations Statistical Division, New York.

This seems to say and solve it all. Architecture! Systems! Data and metadata! Modelling! Right?

Not really. Look at the titles! “Guidelines”. Hm, “guidelines”? Who needs guidelines?

Where are the systems, stupid!

Where are the models, stupid!

And why was none of this published in an academic journal?

There may be a partial answer to the last question. Remember the old adage, that besides Ovid he only quoted himself? I can recommend that you click on the links and then look through these documents for references.

Who do you think appears in the bulk of those references? Some 10 out of 16 are Sundgren himself. And the others? Well, they are also from the statistical metadata circuit. It even seems that several are from the same project or conference.

In these lists we also find a partial answer to what happened between 1974 and 1991. There are a few publications by Sundgren, and by Malmborg and Sundgren, from 1980, 1984 and 1989. Also, we find an article published in an academic journal in 1985! However, it is not by Sundgren, and it appeared in Statistics Sweden’s own journal (JOS).

Where are the articles in real academic journals, stupid!

We also find that the OPR(t) and alpha-beta-tau models have resurfaced. Why do these models appear already in 1974 and then remain in oblivion until the mid 1990s, when they are suddenly published by the UN? Without references? Surely, by then there was plenty of new research and theory about, for example, dimensional modelling that could be referenced?

Who has heard of a professor that publishes his results only via government institutions and conferences?  What kind of a professor is that? What has he achieved? Why is he afraid of real peer reviewing? Why has he not published anything in academic journals? (There it is, again, that obnoxious question).

There is a bigger picture here. As before, we can give Bo the benefit of the doubt. Maybe statistical computing is very different in the NSO/ISO sector, and he has truly been a pioneer in this area. Maybe that is why he does not have any real articles or references?

However, this does not fly. Already in the 1974 ARKSY report he claimed that the project was of interest outside of the NSO community. It is also clear that he has been intellectually lazy in referencing others and that he has avoided the necessary peer reviews of his OPR(t) and related models.

The very unfortunate result of this is that it has become customary within the statistical metadata circuit to not build bridges and validate their work in the real world. You quote Sundgren and Sundgren quotes himself.

However, worse than that is what this has done to problem solving. If NSO/ISO requirements are very different, then exploring and understanding those differences should be at the heart of all statistical metadata concerns. Is it? No, they seem happy with their Neuchatel, and their DDI, and their SDMX, and their GSBPM, without referencing, comparing or exploring what has been done in the private sector.

Bo Sundgren’s intellectual biography tells us something about how statistical metadata research developed into a closed shop.

It gets worse. There is in fact reason to suspect that Bo, some time between 1974 and 1991, realized that either the goals were not possible to meet or he was not capable of finding a solution. So he salvaged what he could, and created a closed shop for himself where he could take on the role of pioneer and guru, and no one would hold him accountable.

And it worked.

Today, the same thing is being done, but now on a level and a scale that no one could dream about in the 1990s. That includes the level and scale of charlatanism.

Footnote: The SCBDOK model as described in 1991 can be compared with the current GSBPM and especially the GSIM. It is still better than the latter. In the realm of the blind, the one-eyed man is king!

Bo Sundgren – An intellectual biography – No. 4 ARKSY

So, Bo finished his doctoral thesis in 1973. In it he predicted a unified theory of databases and that the relational model would not be able to solve the bulk of database needs. Then what?

On his blog we find a number of internal reports from Statistics Sweden, from 1974. That is all until 1991!

(I rely on my Swedish informant, so there may be one or two errors of translation. If so, let me know and I can correct them!)

The key acronym for these activities in 1974 seems to have been “ARKSY”. What on earth was that? And why was that the only thing Bo produced for eighteen years?

(At least, the only thing he produced that was R & D and that can now be published on his blog. If you know of any other R & D publication by Bo between 1975 and 1990, please send me an e-mail!)

So, ARKSY was an acronym for “archive statistical system”. The purpose of this project was to increase reuse and recombination of statistical data within Statistics Sweden. In its graphical form, the vision was the classical attempt to transcend the stove-pipe organisation. An “input data warehouse”, if you will.

The background for ARKSY was described as changed external circumstances, and it was envisioned that ARKSY would have a general influence on the way that statistics was produced. Have we heard this before? Or, better, how many times have we heard the same thing since then?

It gets better! One of the first steps was to propose a “frame” as a basis for a more strict theory of the production of statistics. Wow! Does that not sound a bit like the GSBPM?

In the description of chapter seven of the report there was further mention of “conceptual systems”, “frame models” and “standards” that were of importance to the realisation of ARKSY. Does this sound familiar?

As concrete systems the vision was that the project would result in a variable and file catalogue, a database management system, and a record (or file) linkage system. Go, figure!

But, what about the surveys and standardization? The hard core content basis of standardisation!

Do you remember my posts about statistics as art or science? I mentioned that the case for standardization must be made by being specific about individual surveys. It must also be specific regarding different aspects of a survey. I specifically mentioned the statistical unit.

In 1974, the ARKSY project provided a full analytical model for this! Wow! There is very little new under the sun.

This analytical model included: (1) statistical units, (2) populations, (3) samples, (4) variables (including classifications), (5) time and (6) (physical?) access to data!
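
To see what such a six-part description amounts to in practice, here is a minimal sketch of a survey description with the six ARKSY parameters as fields, together with a naive reuse test. The field names, the helper function and the example values are my own hypothetical illustration, not terms taken from the 1974 report.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SurveyDescription:
    """One survey, described along the six ARKSY-style parameters.

    Field names are illustrative only, not the terms used in the 1974 report.
    """
    statistical_unit: str   # (1) e.g. "person", "enterprise", "dwelling"
    population: str         # (2) the population the survey is meant to cover
    sample: str             # (3) census, probability sample, register extract, ...
    variables: frozenset    # (4) variables, including their classifications
    reference_time: str     # (5) reference period or point in time
    access: str             # (6) where and how the data can physically be reached


def directly_reusable(a: SurveyDescription, b: SurveyDescription) -> bool:
    """Naive reuse test: the two descriptions must agree on every parameter.

    A mismatch on any single one of the six is enough to block direct reuse.
    """
    return a == b


# Two waves of the same hypothetical labour force survey, differing only in reference time.
lfs_q1 = SurveyDescription("person", "residents 15-74", "probability sample",
                           frozenset({"employment status"}), "2024-Q1", "microdata archive")
lfs_q2 = SurveyDescription("person", "residents 15-74", "probability sample",
                           frozenset({"employment status"}), "2024-Q2", "microdata archive")
print(directly_reusable(lfs_q1, lfs_q2))  # False: the time parameter alone breaks the match
```

Even two waves of the same survey fail the naive test, because the reference time differs; that already hints at the problem discussed below.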

I wonder when this comprehensive analytical scheme was lost to the international metadata circuit? The exact same rhetoric about standardization can now be found in international fora, via HLG-BAS. Yet, one searches in vain for an analytical model of this sort.

On the other hand, if the Swedes have been thinking about and trying to do this since 1974, what are the chances that the international community will be successful in the 2010s?

Clearly, something is wrong here. I think that it is easy to see that with as many as six parameters, the probability that each survey will be unique enough to make reuse difficult is very high. If it is not the definition of the object type, it is the sample, or the time, etc.

Then the report becomes slipshod. It claims that there is some reuse and recombination at Statistics Sweden, but that the potential is much higher than the current practice. Really!? How did they know that for a fact?

Maybe they did not, but they later found out they were wrong? The report actually suggested that a next step was for the subject matter departments to suggest “new archival statistical products”. What happened with that?

What are the conclusions of this?

– That the same abstract visions have been around for over three decades, without achieving their goals.

– Initial claims are made without being substantiated by concrete examples.

– If it is true that there was an attempt to find concrete examples at Statistics Sweden, already in the 1970s, then we would like to know why there never was an ARKSY system.

Of course, what this suggests is that there was no real empirical basis for these visions. Worse than that, it also suggests that Bo realised this already thirty years ago.

My suggestion is that everyone who has contact with Bo should ask him what happened with the ARKSY project and the attempts to standardize and define “new archive statistical products” (and a frame, and standards, and theory, and a variable and file catalogue, etc).

If he cannot give a straight answer, we must conclude that he is a fraud.

Bo Sundgren – An intellectual biography – No. 3 Where are the articles?

We have already noted that Bo finished his doctoral thesis in the early 1970s. What happened after that?

According to his blog, he was a professor of information management 1979-1982 and 1987-2003. In between he was also a professor, but it is unclear of what and for how long.

If we then look at his list of publications we find no publications before 1991, and no publications at all in leading academic journals about informatics or information management!

What we have here is the strange phenomenon of a professor of information management, who during most of his career has not published a single article in a leading academic journal.

It gets worse. He also has practically no publications at all in any academic journals. So, what has this “professor” been doing?

It turns out that Bo Sundgren’s “career” as an “expert” on informatics and information management is a closed-shop, public sector career. He has mainly published himself through his institutional affiliations, especially the Stockholm School of Economics, Statistics Sweden, EU projects and the UN.

What is the credibility of an “expert” and professor of information modelling and information management, who has practically never published himself in an academic journal?

Has anyone ever really peer-reviewed Bo?

We could of course also flip this around, give Bo the benefit of the doubt, and look for structural explanations. If so,

1. There may be something seriously wrong with the communication between NSO and ISO systems research and that of the rest of the informatics community. If so, why?

2. The NSO metadata circuit may be doing something that is of little or no interest to the rest of the informatics community. If so, why?

– Because it is so special and different?

– Because it does not contain anything special or new?

Whichever way we go with this, it does not look good. Exactly what is wrong with this picture?

Maybe we can find an answer to this question if we take a closer look at the contents of what Bo has published?

Bo Sundgren – An intellectual biography – No. 2 Pioneer or phony?

Here is professor Bo Sundgren’s own description of the results of his doctoral thesis from his blog (I also copied the links):

“His doctoral thesis, An Infological Approach to Data Bases, introduced the OPR(t) methodology for conceptual modelling of reality and information about reality, as well as the metadata concept, and a methodology for modelling aggregated statistical data by means of multidimensional hypercubes, the alfa-beta-gamma-tau methodology.”

As we have seen, international discussions and proposals about information modelling were already well under way when Bo was finishing his thesis. What, exactly, did Bo contribute to this?

On page 454ff in his thesis he discussed this himself. He suggested that his theory was more general and that there were still hopes for a more general data base theory that would combine the IBM, CODASYL and his own infological approach.

He also commented specifically on Codd’s relational database model:

“However, the marketing of Codd’s ideas might have given somebody the false impression that the relational approach to data bases would solve the bulk of data base problems”

Really!?

We would like to know what happened with this. Has his OPR(t) model caught on? Did anyone, internationally, know about it, try to use it, or comment on it? Exactly how new and revolutionary was this OPR(t) model? Was this model perhaps just a rip-off or a simple reformulation of what was self-evident, already known, or had been developed elsewhere? Did Bo already then have a poor sense of what is important and applicable in the real world? Was there ever a more general theory of data bases? How does it relate to existing standards today, such as UML?

Similar questions can be asked about the alpha-beta-tau model. What is the difference between this model and classical data warehouse dimensional modelling? When did the theory of dimensional modelling develop for the first time? If Bo was first, why is this not more well known?

I do not have a final answer to these questions. However, my point is something else:

1. It seems that Bo made a serious error of judgement. The relational model has indeed solved the bulk of data base problems. In fact, his own alpha-beta-tau Nordic model for statistical databases is implemented using that model (see the sketch after this list).

2. Informatics in national statistical agencies, including metadata modelling, seems to have a life of its own, without proper interaction with the rest of the informatics community. This is a serious issue, and it begins with Bo.

3. What exactly is the value of Bo’s theorizing to the resolution of practical problems? Why is he allowed to produce one report after the other without having to produce real results?
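
As for point 1, here is a minimal sketch, using Python’s built-in sqlite3 module, of how an aggregated statistical cube of the kind the alpha-beta-tau model is usually taken to describe (classifying variables, a summary measure and a reference time) can sit in a perfectly ordinary relational table and be re-aggregated with a single query. The table, the column names and the toy figures are my own assumptions, not Sundgren’s terminology or data.

```python
import sqlite3

# A toy statistical cube: employment counts by region and sex for one reference period,
# stored in a plain relational table. Names and figures are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE employment_cube (
        region  TEXT    NOT NULL,  -- classifying variable 1
        sex     TEXT    NOT NULL,  -- classifying variable 2
        period  TEXT    NOT NULL,  -- reference time
        persons INTEGER NOT NULL,  -- the summary measure (aggregated count)
        PRIMARY KEY (region, sex, period)
    )
""")
conn.executemany(
    "INSERT INTO employment_cube VALUES (?, ?, ?, ?)",
    [("North", "F", "2024", 120), ("North", "M", "2024", 135),
     ("South", "F", "2024", 210), ("South", "M", "2024", 198)],
)

# Re-aggregating over one dimension is a single relational query.
for region, persons in conn.execute(
        "SELECT region, SUM(persons) FROM employment_cube GROUP BY region"):
    print(region, persons)  # North 255, South 408
```

Nothing here needs a data base theory of its own: one flat table and a GROUP BY cover the case, which is exactly the point.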

Back to the initial question: Is Bo a pioneer or a phony?

My answer is that the real issue is that no one in the NSO metadata circuit seems to have asked this question; and that we will never know unless this circuit is properly integrated in the general informatics community.

Has Bo ever published any of his theories and models outside of this circuit, in leading academic or other journals about informatics and computing? If not, why not?

The importance of this issue of splendid isolation – from the informatics community and from evaluation based on practical results – will become clearer as we move on with Bo Sundgren’s intellectual biography.