[Informatics] Primary and secondary data

Mark mark at vceit.com
Mon May 16 16:13:27 AEST 2016


Hi, SAT navigators

Recently I've been trying to sidle unobtrusively out of the Informatics
back door and leave you young, clever, and energetic people to usurp the
authority you deserve and collectively solve the problems and challenges
raised by the new study design.

Occasionally though, a question arises based on pure theory rather than
classroom practice (with which I am woefully and gratefully out of date).

My QWERTY fingers start twitching and I feel a burning need to butt in.

Old teaching habits die hard.

As a fun experiment on old teaching habits: sneak up on a group of English
teachers and casually misuse the words "decimate", "literally", and
"irregardless" in the same sentence. It's fun for the whole family.

Anyway.

I seem to remember some people here being uncertain about whether data
downloaded from sites like ABS or CSIRO is primary or secondary data.

Hereunder is my 2.2 cents' worth
(GST inclusive. Tips not included: don't be cheap.)

My understanding is that *primary* data is original data personally
collected from the sources by the student/researcher to answer a particular
question - using surveys, questionnaires, interviews and direct observation.

*Secondary* data is collected, processed, stored and supplied by anyone
*but* the researcher.
A dataset you downloaded from a website does *not* become primary data
just because you downloaded it yourself.
Reading a magazine article and cutting and pasting its data table does not
give you primary data. Locating data is not the same as creating data.
*You* did not gather the data.
*You* did not encode it, interpret it, process it, summarise it, store it,
and upload it.
You just *found* it.  It's secondary data. It's not *your* data.
It was collected, processed, often interpreted, and presented by someone
else.

Secondary data is characterised by its being pre-processed by other
parties.
- It may be slightly processed numeric data (e.g. lists of numbers from
CSIRO that have been sourced, selected, validated, sorted, averaged,
categorised, curated, and cleansed).
- It is often presented as other people's opinions, conclusions or
judgements about the meaning of data that has been heavily processed and
interpreted, e.g. research papers' conclusions or editorial opinions based
on recent government statistics.

I admit that there are grey areas where it may be unclear which is primary
and which is secondary - such as when someone trawls through decades of
newspaper articles to gather data on the frequency of usage of particular
words. The newspapers would be secondary data, but the extracted data could
be said to be primary since it's used in an original context. But I doubt
this will often be relevant to students and the ITI SAT.
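
For the curious, the newspaper grey area can be sketched in a few lines of
Python. This is only an illustration under stated assumptions: the function
name word_frequencies and the two sample sentences are invented for the
example and do not come from any real newspaper archive. The articles going
in are someone else's (secondary) text; the counts coming out are new data
that the researcher created.

```python
import re
from collections import Counter


def word_frequencies(articles, words_of_interest):
    """Count exact occurrences of each word of interest across all articles.

    Hypothetical helper for illustration only. Matching is case-insensitive
    and exact, so "decimated" does not count as "decimate".
    """
    targets = {w.lower() for w in words_of_interest}
    counts = Counter({w: 0 for w in targets})  # start every target at zero
    for text in articles:
        # Split the lower-cased text into simple word tokens.
        for token in re.findall(r"[a-z']+", text.lower()):
            if token in targets:
                counts[token] += 1
    return dict(counts)


# Invented sample text standing in for decades of newspaper articles.
articles = [
    "The storm literally decimated the crop.",
    "Critics said the budget would decimate services.",
]

freq = word_frequencies(articles, ["decimate", "literally"])
print(freq)
```

The resulting table of counts did not exist until the researcher built it,
which is why it could arguably be called primary data even though every
article it came from is secondary.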

I'm open to a five-minute argument
<https://www.youtube.com/watch?v=kQFKtI6gn9Y>. Or even the full half-hour.
I'll be in Room 12A.

Regards
Mark

Another fun experiment: get an English teacher to read this and comment on
its unashamed use of the *Oxford comma*. Let the games begin.

-- 

Mark Kelly

mark at vceit.com
http://vceit.com