ULASAN JURNAL (Lampiran)- By Haziq Azree


0/28/12 International Test Commission Publications orta

1/13.intestcom.org/Publications/ORTA/Social implications and ethics of testing.php?print=true

Table of contents

Social implications and ethics of testing
Abstract
Introduction. Ethics as reflection and personal choice
What is ethical? Right and wrong beyond the law
What makes a test ethical?
(a). Adherence of the testing process to the general principles of the scientific method
(b). Characteristics of the interaction process between the testing professional and the other stakeholders involved in the testing process
Informed consent
Test security and user qualifications
References
Suggested hyperlinks to web pages
Questions for classroom discussion
Author information

Social implications and ethics of testing

Dragos Iliescu, Dan Ispas and Michael Harris

Uploaded June 2009

Abstract

The paper aims at increasing awareness of testing professionals towards ethical issues. Ethics is treated as a meta-category, defining right and wrong beyond law, morality or religion. International guidelines addressing ethics in the field of psychological testing are discussed. We discuss what makes a test and the testing process ethical: the adherence to the general principles of the scientific method (objectivity, reliability and validity) and some characteristics of the interaction process between the testing professional and the other stakeholders involved in the testing process (fairness, procedural justice, sharing and communication of results, informed consent, test security and user qualifications).

Introduction. Ethics as reflection and personal choice

A discussion on the ethics of testing probably should begin with a rigorous definition of what ethics is. But ethics is difficult to pin down, and most theorists in the field of ethics avoid giving clear definitions, preferring instead to build the case for ethics by examples. We should state at the outset that ethics describes criteria for assessing the appropriateness of behaviors, be they actions, decisions, or intellectual stances.

Ethics is, from a scientific point of view, a branch of philosophy. Philosophers distinguish between normative ethics, which is a prescription on what people should believe to be right and wrong, and applied ethics, which focuses on the examination of specific real-life situations. As such, applied ethics is not exclusively the turf of philosophers, but also of the practitioners who are confronted by specific real-life issues (Barnhart, 2002).

In a very broad sense, ethics refers to the principles of right and wrong conduct. Ethical standards are prescriptions about what humans in general (or in our case professionals) ought to do, usually worded in terms of rights, obligations or benefits to society or for a greater good. As such, ethics has very loose boundaries with other domains, such as morality, religion and law. We will address some of these differences in a later section, building the case for a definition of ethics as a significantly broader intellectual endeavor than the common conception of analyzing right and wrong.

In our understanding, ethical judgments are action-guiding, while not being purely prescriptive. Of course, ethics is normative and prescriptive: it is concerned with how we ought to act and what results we ought to try to bring about. Still, while ethics is normative, stating how we ought to act, and what results we ought to aim for in our actions, ethics is more than a set of principles. It is a matter of course that ethical behavior has to be based on sound principles and values. However, ethics is not an automatic comparison of a real-life situation to a set of norms; it also requires active intellectual processing. The impossibility of automatic normative judgment is given by the fact that real-life situations most often require the practitioner to react to complicated issues, which have bearing on multiple and conflicting values or ethical principles, thus defining ethical dilemmas. It is therefore appropriate to state that ethics is the study of what happens when there are no simple answers to a situation.

However, ethics also requires ethical thinking, i.e. ethical reflection. Ethical reflection is based on the perception of ethics, as well as on the ethical judgment (Roberts & Wood, 2007). In order to have an ethical reflection, a practitioner should be able to perceive and identify the dilemma as a situation which involves ethics in some way. He/she should be able to apply one or more ethical principles to this situation, should consider alternatives and should come to a personal decision of how he/she will behave. Thus, being ethical is something one does by one's own choice, beyond legal or moral prescriptions.

Defining ethics on the basis of reflection and personal choice has a bearing on the course of this paper. We argue that practitioners should be prepared to judge professional situations and issues critically and creatively from an ethical point of view. In order to facilitate such a behavior, a normative stance in this paper would be of little if any help. It is not our intention to cover all the possible combinations of conflicting principles which would define dilemmas requiring ethical reflection.

Instead, we hope to provide practitioners in testing with heightened awareness of what we consider to be relevant ethical categories in the field of psychological testing, facilitating in this way ethical reflection when real-life situations have a bearing on these issues.

What is ethical? Right and wrong beyond the law

Developing competencies for ethical reasoning is important for testing practitioners, because ethics also has a bearing on the social responsibility of testing professionals. Ethics is about relationships, about the place of a service to society, or, in a broader sense, about the place of a profession in society.

Ethics is at the core of virtually every discipline or profession, but is considered of high importance for relatively few. The statement of ethics is thus a statement of social responsibility of a profession and at the same time a statement of personal responsibility of those who practice that profession (Oakland, 2005).

Ethics in testing or, in a broader sense, in psychological work, usually provides explicit norms of correct behavior. Most national associations of psychology have explicit ethical standards by which their members abide. Leach & Oakland (2007) have discussed 31 ethics codes impacting the practice of psychology in 35 countries and found them important benchmarks for professional competence.

However, Pope & Vasquez (2007) urge that awareness of the existing ethics codes and formal standards, while crucial to professional competence, is not a substitute for an active and deliberative approach to fulfilling the ethical responsibilities required by professional practice. Ethics is thus more than blind and indiscriminate adherence to a standard or law. Ethics moves the discussion towards right and wrong beyond the law, and ethical judgment may and should be exercised aside from the law and even in the absence of a law. For example, ethical judgment enables psychologists and test users to be active in countries without formal standards in these domains (Leach & Oakland, 2009).

The statement that ethical judgment goes beyond the law addresses the basis for that specific judgment, which is not so much the law as the moral principle underlying it. Furthermore, we underline the fact that moral principles are not enforceable. While a law, or even an ethics code adopted by a national association, may and will be enforced, and trespassing will and should be prosecuted in some manner, moral standards move beyond the possibility of an organization enforcing them.

Most countries around the globe have laws regarding the activity of psychologists in general. However, there are very few laws concerned with tests and testing. As ever more psychologists work cross-nationally and as psychology becomes ever more internationalized, it is important for international organizations with an interest in tests and testing to assume leadership and to generate a normative body valid from an international point of view.

So far, psychologists, educators and other professionals active in testing have turned to the implicit moral standards which are the foundation for laws and ethics codes. Until a relevant international association provides a comprehensive set of rules, to form a standard or guideline targeted directly to tests and testing, the main body of principles governing ethical reasoning related to tests will be implicit and related to our deepest beliefs as human beings and as psychologists.

Some formal documents, which are the result of pioneering work and adherence to high standards of professional practice, have proven to be very influential with respect to the ethics of testing. We should mention in this respect the continuous work done by the APA, which is reflected in its latest form in the Standards for Educational and Psychological Testing (AERA, APA, NCME, 1999), a voluminous document of over 100 pages, discussing more than 250 principles of ethical testing, grouped in 15 categories. Also, we should mention the Principles for the validation and use of personnel selection procedures, published by the Society for Industrial and Organizational Psychology (SIOP, 2003), which has proven to be a landmark for I/O testing around the globe. However, in spite of their wide reach and pioneering foresight, these documents and others like them are tributary to local values, customs and practices and do not bring about adherence from psychologists around the world.

An important step forward was the exceptional leadership assumed by the International Test Commission, visible in a set of three standards related to test usage (ITC Guidelines on Test Use, Bartram, 2000), test adaptation (ITC Guidelines on Adapting Tests, Hambleton, 1994; Oakland, 2005) and computer-based and internet-delivered testing (ITC International Guidelines on Computer-Based and Internet-Delivered Testing, Bartram & Coyne, 2005). Several regional codes of ethics have provisions relevant for testing. Examples include the common code of the five Nordic countries of Denmark, Finland, Iceland, Norway, and Sweden (Nordic Psychologists Associations, 1998), the brief declaration of ethical principles of the four South American countries of Argentina, Brazil, Paraguay, and Uruguay (Ethical Principles Framework for the Professional Practice of Psychology in the Mercosur and Associated Countries; Ferrero, 2006) and the EFPA (European Federation of Psychologists Associations) Meta-Code of Ethics, which was approved in 1995 and revised in 2005 (Lindsay, Koene, Øvreeide, & Lang, 2008). However, none of these codes and guidelines has the international coverage needed for their universal acceptance.

We feel that one document is especially relevant in this situation. The Universal Declaration of Human Rights (Gauthier, 2008) is founded on universal human values and enumerates universal human rights. This international document has a strong moral basis and has become embedded in laws and ethics codes of a large number of countries and professions. The Universal Declaration of Ethical Principles for Psychologists was adopted by the International Union of Psychological Science (IUPsyS) and the International Association of Applied Psychology (IAAP) in 2008 (Gauthier, 2008).

The Universal Declaration is the first document which has been approved internationally, by relevant organizations, stating a set of general principles for the profession of psychology. The Universal Declaration consists of a preamble, which is followed by four principles, each developed into 5 to 7 underlying values. The four broad principles are: Respect for the Dignity of Persons and Peoples; Competent Caring for the Well-Being of Persons and Peoples; Integrity; and Professional and Scientific Responsibilities to Society. The objective of the Universal Declaration is not to provide a cross-frontier code of ethics, but to provide a moral framework and generic set of ethical principles for psychology organizations worldwide (p. 1). In doing so, the Universal Declaration builds upon ethical principles that are based on shared human values, describing principles and values that are general and aspirational rather than specific and prescriptive (p. 1).

In view of this situation, where there are few if any laws regarding tests and none are binding internationally, where judgment is moral rather than legal and where implicit and universal moral values are captured in relevant international documents, we believe that for a test or testing procedure to be considered ethical, in a broad sense, it should abide by the principles of the Universal Declaration of Ethical Principles for Psychologists.



What makes a test ethical?

Being ethical is a characteristic of a behavior and not of a product of behavior. As such, it is inappropriate to discuss a test as ethical or unethical. Instead, this attribute may only be applied to the way a test is used. Even a test with the most peculiar characteristics, for example a test which does not cover the intended domain correctly or has too large an error of measurement, could be used in an ethical manner. For example, a new test, which still lacks extensive proof of validity, may be used in an ethical manner if the testee is informed in advance of its experimental nature and if the test is only used for low-stake decisions, as an icebreaker, or in conjunction with other assessments. Therefore, the discussion should be about the ethics of testing or, in a broader sense, the ethics of psychological or educational assessment.

Another important point to be made is that a practice or procedure is only ethical or unethical if defined as such by the code of ethics a professional abides by (Leach & Oakland, 2007; Øvreeide, 2008). This specification is important in the context of internationalization and globalization of psychological practice, in which we often are tempted to force our own values and our own ethical points of view upon other practitioners, from other areas of the world, where those values do not apply in the same manner.

As such, a specific behavior may be labeled as unethical by a certain code of ethics and still be ethical within the limits of another one. Ethics codes are an expression of underlying values. We would all like to believe in universal values, and subsequently in universal ethical principles, but so far it has been difficult to sum up ethical principles in universal, cross-national codes. Even though, as noted, several cross-national initiatives have been attempted and have had some impact in the international scientific community, it is the national laws and the codes of ethics of national organizations that most test users abide by. It would be wishful thinking to assume that we won't find disagreements between codes of ethics stemming from different sources.

We should not conclude that one couldn't act ethically unless at least one code of ethics supports that specific behavior. Values, or reified constructions of what is good or desirable, dictate our evaluative reasoning regarding ethics as much as formal documents, like codes of ethics or laws. Still, a behavior should only be deemed ethical or unethical with regard to the specific norm the psychologist adheres to, and not based on our own construction of what is correct.

As a result, it is virtually impossible to discuss in this paper how ethical or unethical a certain behavior or procedure is, as the legitimate question would arise: based on what code? Especially in a document such as this, published under the patronage of the International Test Commission, the need to cover cross-national practice will take us to a continuous reference to the category of good practice.

We will discuss two main classes of characteristics which should be considered when discussing how good the practice of a testing procedure is: (a) adherence to the general principles of the scientific method and (b) characteristics of the interaction process between the testing professional and the other stakeholders involved in the testing process.

(a). Adherence of the testing process to the general principles of the scientific method

Testing is conducted following the scientific method. Broadly speaking, testing is carried out with the explicit purpose of generating scientific data for decision makers. Because testing uses the scientific method, professionals in this area, be they psychologists, educators, or other professionals, are called upon to apply the method to the best of their abilities. According to the generally accepted principles of the scientific method, this means that testing should be used in order to generate the needed information in an objective, reliable and valid manner. Any usage of a test that does not adhere to this principle could not be labeled as good practice.

Objectivity

Objectivity is the main principle of the scientific method. As an expression of the scientific method, psychological and educational testing should be as objective as possible. In testing, objectivity refers to inter-user consistency in the execution, scoring and interpretation of standardized assessment procedures (Westhoff & Kluck, 2008, p. 68).

Of course, complete objectivity is never possible and certain facts which are accepted at a certain moment in time as correct, true or objective in a scientific sense are often overthrown by other, new, scientific discoveries. This is a characteristic of scientific reasoning, which is consensual by nature and reflects the shared understanding of the scientific community at a certain point in time, rather than truth in an absolute sense.

The principle of objectivity translates in the domain of tests and testing in three ways.

First, we will consider a test as being objective if the procedures for administration, scoring and interpretation are standardized and constant across time, users and test takers (Kline, 1993). All test users should administer, score and interpret the test in the same way and all test takers, regardless of their characteristics or of the moment in time the test is administered to them, should have the same opportunity to perform.

While administration and scoring may be easily standardized in order to be considered objective, interpretation always calls for the professional judgment of the testing professional and as such inherently brings subjectivity into the equation. The need to interpret test data in an objective manner is acknowledged by one of the values of the second principle of the Universal Declaration of Ethical Principles for Psychologists, (f) self-knowledge regarding how their own values, attitudes, experiences, and social contexts influence their actions, interpretations, choices, and recommendations.

Second, objectivity refers to measurement without error (Anastasi, 1997). Scientific knowledge develops constantly in its attempt to minimize error in the human understanding of phenomena. Thus, the principle of objectivity translates into ethics as a dedication to the latest state of scientific knowledge. This is reflected in one of the values of the second principle of the Universal Declaration of Ethical Principles for Psychologists, (e) developing and maintaining competence.

Specifically, for test users this prescribes the need to critically analyze the construction, administration, scoring and interpretation of tests, in order to ascertain if they are in accord with the latest scientific developments in the field.

The state of scientific knowledge has advanced radically in recent years in the area of test construction. New technologies have emerged and it is nowadays often inappropriate to use a test in the construction of which these new technologies have not been employed. We now have new ways of approaching the design of a test, as well as sophisticated statistical procedures, like Structural Equation Modeling, Item Response Theory, Differential Item Functioning and others. These and other procedures are an assurance a test author may give to the users of his/her test, as they could contribute to proving to the test user that the test he/she uses is up to the latest scientific knowledge.

We may consider a procedure, be it a newly developed or an older one, as not representing good practice to the extent to which it cannot live up to criteria imposed by current scientific understanding. Subsequently, good practice in test usage will focus the preference of test users:

- towards newer procedures;

- towards using well-documented procedures;

- towards procedures (be they old or new) which have proven to live up to the latest developments in science, by empirical evidence;

- towards using tests with new or updated norms.

Third, objectivity is not simply a characteristic of a test, but also of the situation. A test may be objective for a given situation or for a given population and not for another, something called differential functioning. Ethical test usage calls for an evaluation of the behavior of a test across situations and across populations. Sometimes tests behave differently in different contexts or for different populations and there should be empirical evidence or reasonable theoretical backing in order to state clearly what the psychometric features of a test are for the target population it is used in. Good practice (i.e. ethical behavior) in this respect will never just assume that a test performs well on a specific population or in a specific situation, but will rather ask for scientific evidence that this is indeed so.

Reliability

Reliability is important for the topic of ethical testing, because reliability describes the error associated with the measurement. Decisions based on test scores should only be taken with a careful consideration of the error associated with the measurement of those test scores. While recent publications have started a modern debate on reliability (e.g. Thompson & Vacha-Haase, 2000; Dimitrov, 2002; Fan & Thompson, 2001), we will address this construct here as outlined by AERA, APA & NCME (1999): "Reliability refers to the consistency of [...] measurements when the testing procedure is repeated on a population of individuals or groups" (p. 25).

Reliability poses at least two ethical questions. These are related to the dichotomy of high vs. low-stake decisions and to the different types of reliability.

High vs. low-stake decisions. The first question related to reliability is a fundamental one: how much should we rely on the results of the test? Again, the discussion is not one of relying or not relying, but of how much to rely. The degree of reliance on the test result describes the limits of its ethical usage. If we may not rely heavily on test results, then they are useless as a basis for decision. Reaching decisions or feeding decision makers information based on unreliable data is unethical behavior.

Naturally, as our discussion underlines the degree of reliance, the following question arises: how much is acceptable? Where should we draw the line and view a specific test result, based on its low reliability, as being unethically used in a decision? Scientific consensus sets certain limits on reliability and describes the types of decision that are possible to be reached in a certain span of reliability. While a satisfactory level of reliability depends on how the measure is being used (Nunnally & Bernstein, 1994, p. 264), the generally recommended limits are .70 and .90 (Nunnally & Bernstein, 1994, p. 265).

High-stake decisions should never be reached if the reliability of the procedure used as a basis for decision-making falls below the .90 level. Low-stake decisions may be reached with scores which have reliability below the .90 level, but not below the .70 level. The distinction between high and low-stake decisions refers not only to the impact of the decision, but also to the scope of the decision: highly reliable tests are needed to sort individuals into many different categories based upon relatively small individual differences (e.g. intelligence), while lower reliability tests are sufficient if the tests are used to sort people into a smaller number of groups, based on rough individual differences. Procedures with a reliability placed below the .70 level should be used only with the utmost care. The segmentation of decisions into high vs. low-stake again places emphasis on the situational aspects of ethical behavior: it is only unethical to use a test for a decision for which it is not qualified by its reliability.
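The thresholds in the preceding paragraph can be read as a simple decision rule. The following Python sketch is one possible encoding of that rule; the function name and the returned labels are hypothetical and introduced only for illustration, while the cut-offs are the .70 and .90 limits cited above from Nunnally & Bernstein (1994).

```python
def permitted_decision_stakes(reliability: float) -> str:
    """Map a reliability coefficient to the kinds of decision the text allows.

    Hypothetical helper: the labels are illustrative, the cut-offs (.70, .90)
    are the limits cited in the text.
    """
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie between 0 and 1")
    if reliability >= 0.90:
        # At or above .90: high-stake decisions are admissible.
        return "high-stake and low-stake decisions"
    if reliability >= 0.70:
        # Between .70 and .90: only low-stake decisions.
        return "low-stake decisions only"
    # Below .70: the text asks for the utmost care.
    return "use only with the utmost care"


if __name__ == "__main__":
    for r in (0.93, 0.82, 0.61):
        print(f"reliability {r:.2f}: {permitted_decision_stakes(r)}")
```

A rule of this kind only screens on reliability; as the text goes on to argue, borderline cases still call for professional judgment that also weighs validity and the cost of wrong decisions.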

Still, in spite of these rather clear guidelines, some situations require professional ethical judgment on the part of the professional. In many settings, for example in some I/O settings, tests represent pass/no pass hurdles. In these situations, the discussion around reliability gains a supplementary significance, and the professional using the test should probably try to include in his/her decision not only the criterion of reliability, but also the validity of the test, and the possible costs for the company when selecting false positives. Also, the test user could inform decision makers on this dilemma and build together with them an ethical approach to this business case. The basic rule of this approach should be the principle that the usage in itself of a sub-optimal test is not unethical, if the limitations are known and acknowledged in advance by decision makers.

Standard Error of Measurement. The reason for the high emphasis placed on reliability as a psychometric feature of a test is not reliability in itself, but the ability of this characteristic to predict the distribution of the errors associated with the measurement, for the specific test (Dudek, 1979). The concept used in this respect is Standard Error of Measurement (SEM). SEM estimates how repeated measurements of the same person on the same test are distributed around his or her true score. The true score cannot be measured directly, because it is impossible to construct a test which is completely error-free. From a statistical point of view, SEM is defined as the standard deviation of errors of measurement that are associated with test scores from a particular group of examinees (Harvill, 1991, p. 1). From a logical point of view, SEM is directly related to the reliability of a test. The more reliable a test is, the smaller its SEM, i.e. the less error and the more precision is associated with the measurement.
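The relationship between reliability and SEM described above is usually made concrete with the classical test theory formula SEM = SD × sqrt(1 − reliability). The paper does not state this formula, so the sketch below rests on that assumption; the function name and the example scale (standard deviation of 15) are hypothetical.

```python
import math


def standard_error_of_measurement(score_sd: float, reliability: float) -> float:
    """Classical test theory estimate: SEM = SD * sqrt(1 - reliability).

    score_sd     standard deviation of observed scores in the reference group
    reliability  reliability coefficient of the scores (between 0 and 1)
    """
    if score_sd < 0:
        raise ValueError("score_sd must be non-negative")
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie between 0 and 1")
    return score_sd * math.sqrt(1.0 - reliability)


if __name__ == "__main__":
    # Hypothetical scale with SD = 15, evaluated at three reliability levels.
    for r in (0.70, 0.90, 0.95):
        sem = standard_error_of_measurement(15.0, r)
        print(f"reliability {r:.2f} -> SEM {sem:.2f}")
```

Under these assumptions the SEM shrinks as reliability rises, and reaches zero only for a perfectly reliable test, which matches the qualitative statement in the text.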

The acknowledgment of imperfect measurement and the existence of SEM pose a difficult problem for test users, conceptualized by the impossibility to look upon a test score as a true score. Instead, a measurement provides the test user with a range, not with a score: the range where the true score is placed, with a certain probability.

The obligation to operate with ranges of scores and not with scores brings the discussion into the field of ethics. Most often, decision makers need clear and unequivocal information, which has to form the basis of their decision. Operating with ranges of scores makes, for example, the comparison of two scores difficult, especially when they are close to one another. Differentiation between different scores, stemming from different test takers, is thus jeopardized. But failure on the part of the testing specialist to recognize the need of operating based on test scores and SEM will amount to unwarranted decisions. Good practice will take the Standard Error of Measurement into account when communicating test scores or when reaching decisions.
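The difficulty of comparing two close scores can be illustrated with score bands. The sketch below makes several assumptions that are not in the paper: SEM is computed with the classical formula SD × sqrt(1 − reliability), the band is centered on the observed score, errors are treated as normally distributed, and z = 1.96 is used for an approximately 95% band. All function names and example values are hypothetical.

```python
import math


def score_band(observed: float, score_sd: float, reliability: float,
               z: float = 1.96) -> tuple[float, float]:
    """Return (lower, upper) limits of a band around an observed score.

    Assumes SEM = SD * sqrt(1 - reliability) and normally distributed error;
    z = 1.96 gives an approximately 95% band.
    """
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie between 0 and 1")
    sem = score_sd * math.sqrt(1.0 - reliability)
    return (observed - z * sem, observed + z * sem)


def bands_overlap(a: tuple[float, float], b: tuple[float, float]) -> bool:
    """True when two (lower, upper) bands share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


if __name__ == "__main__":
    # Two hypothetical test takers on a scale with SD = 15 and reliability .90.
    first = score_band(108, 15.0, 0.90)
    second = score_band(112, 15.0, 0.90)
    print(f"first:  {first[0]:.1f} to {first[1]:.1f}")
    print(f"second: {second[0]:.1f} to {second[1]:.1f}")
    print("bands overlap:", bands_overlap(first, second))
```

When the bands overlap, as they do for the two hypothetical scores here, reporting only the obtained scores would suggest a difference between test takers that the measurement cannot support.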

The problem is further complicated by the fact that decision makers are usually not comfortable with using interval scores. Most are not trained in understanding confidence intervals and will discard this information, focusing on the obtained score. Ethical behavior and good practice on the part of the testing specialist will take into account this possibility. The specialist will take special precautions when reporting test data, will include SEM in reports in such a way as to make it impossible for this information to be discarded by decision makers not trained in psychometric theory, and will warn decision makers of the problems associated with the use of obtained scores.

What type of reliability? The second ethical question posed by reliability addresses the type of reliability employed. Reliability of scores on a test is assessed through two fundamentally different approaches. One approach considers that a test is more reliable if there is a higher correspondence between different parts of the test. This is called internal consistency and is measured through such procedures as Cronbach's coefficient alpha, split-half correlation, or the correlation of parallel forms of the test. The other approach states that a test is more reliable if there is a higher correspondence between the results of the test obtained at different moments in time. This is called test-retest reliability and is a correlation between the two administrations of the test.

This problem is more subtle insofar as many test specialists use the two types of reliability interchangeably, even though they are clearly not interchangeable. Tests are always employed in order to give answers to specific questions. The type of reliability that should be taken into account for a specific question is dictated by the theory behind the analyzed construct. If the theory addresses a concept which should be relatively stable in time, then test-retest reliability should be measured. If the theory states that the parts of the test (i.e. items, subtests) should measure the concept in a similar way, internal consistency should be measured. The underlying theory thus tells us what kind of reliability should be measured, and some theories require the use of both types of reliability.

Reliability is relevant to ethics as failure to consider the correct type of reliability will most probably stem from a lack of a documented application of the test. The test in itself will not be better or worse through this, but the conclusions and decisions will possibly be based on a reasoning which will not be consistent with the intended use of the test, as prescribed by the underlying theory. Therefore, good practice will take the correct type of reliability into account, adapted to the testing situation and the target construct.

Validity

The current scientific understanding defines validity as a complex and integrated corpus of scientific knowledge and demonstrations, which examines the psychological variables measured by a test. Validity thus refers to the degree to which evidence and theory support the interpretation of test scores (AERA, APA, NCME, 1999, p. 9). The knowledge and the demonstrations are rarely collected in a single place and in a coherent manner, but most often are presented in various formats, in various places. Examining the validity of a test requires an active search and an attentive examination of the pieces of knowledge related to the test.

Validity is the most fundamental consideration in developing and evaluating tests (AERA, APA, NCME, 1999, p. 9). Validity tells us what a test measures and allows us to interpret the results of the test, to formulate descriptive conclusions and predictions based on test scores.

The first way validity is related to ethics is through the way a test is selected for usage. Test users have an obligation to only use tests which have been sufficiently validated for the intended purpose of the testing and the intended target population.





There are no tests which are valid for all situations and all populations. Validity is very much a situational aspect. As noted in Anastasi (1997), validity has a bearing not on the test itself, but on the interpretations of the test. Interpretations are very situational and integrate, along with test scores, the objective of the assessment and personal, cultural and situational variables. Tests should only be used if there is enough evidence supporting the benefits of using the test in relation to these other variables.

Second, test users should only base their opinions and recommendations in assessment reports on data which offer sufficient validity to support these opinions. This has a bearing on the interpretability of test scores: aside from the obvious descriptive step, test users often tend to make predictive judgments. They use test scores to predict future behaviors of the test taker. These predictions should be based not only on logical or theoretical assumptions, but also on empirical evidence (such as predictive validation studies).

At most times, this also means that the test is by itself not sufficient to warrant a valid decision. Using test data for decisions is partially independent of the test. The validity of decisions based on test data is thus different from the validity of the test. Valid decisions need the test user to take into account other relevant sources of information and to integrate these with the test scores. Especially in high-stakes contexts (AERA, 2000), we will consider it good practice when test data are integrated with data from other sources in order to reach valid decisions. Still, as ethical judgment will focus on the result and not on the automatic application of any principle, integration of test data with data from other sources should be considered carefully. There are times when data derived from other sources are less valid than the data resulting from tests. In the integration of data, proper consideration should thus be given to the weighting of data.

Conclusions on the adherence of the testing process to the general principles of the scientific method

There are no perfect tests. Objectivity, reliability and validity are not switches with only on or off states. Instead, there are many in-betweens from white to black. The psychometric characteristics of tests are, the same as many other characteristics, distributed normally across the population of tests. It is not unethical to use a test which is placed at the average or even under the average of this distribution, if the test user understands the limitations of the respective test, if the usage is done with a clear and complete understanding of the dangers, if the caveats are accepted and taken into account by the specialist using the test, and if these drawbacks are communicated in a transparent way to the client. After all, a test is always employed to answer a specific question and one of the main criteria for choosing a specific test in a specific situation is the cost-benefit ratio for the particular client's question. It is, however, bad practice for a test user to ignore the need to document the procedure, as is the decision to ignore shortcomings, or to use tests with shortcomings without communicating those to his/her client. Professional decisions and circumstances often allow for the usage of a test which is less than perfect for the intended purpose. This in itself is not unethical behavior, on the condition that the testing specialist understands the caveats and that he/she explains the drawbacks and cautions to his/her client.

Virtually all the principles and all the characteristics of ethical behavior or good practice discussed here are a mix between common sense and a high level of professional judgment. In order to follow ethical guidelines, a testing specialist not only has to be aware of ethical practices in the respective area, but he or she also has to have a high level of professional understanding, in order to be able to evaluate the technical implications behind his/her decisions.

(b). Characteristics of the interaction process between the testing professional and the other stakeholders involved in the testing process

Testing is an interactional process between the testing professional, the test taker and the client of the testing process.

The client of the test user (i.e. the decision maker) is sometimes the tested person himself or herself. However, at other times, as in the case of testing in the field of I/O, forensic, educational or clinical psychology, the test taker is different from the decision maker. The testing professional has ethical responsibilities towards both categories of stakeholders.

Fairness

From this interactional point of view, the concepts of ethics and fairness are often interchangeable. Even though fairness is much closer to everyday language and thus closer to the test taker, the concept of fairness is used in many different ways. For example, even though it discusses four different meanings of fairness in testing (fairness as lack of bias, fairness as equitable treatment in the testing process, fairness as equality in outcomes of testing and fairness as an opportunity to learn), the Standards for Educational and Psychological Testing (AERA, APA, NCME, 1999, p. 80) state that consensus on what is and what is not fair has not been achieved in the professional community and even less so in larger society.

Ultimately, fairness is a mental construction of the test taker and the relevant community and as such it is subject to many influences. Like all other mental constructions of the test taker, the construction of fairness may be influenced by the test user through the way he/she communicates in relation to the test itself and to the testing procedure.

Procedural justice

Even when testing is perceived by the test taker as being objective, reliable and valid (and thus scientifically correct), the perception of the test taker regarding the control he/she has over the outcome of the test may vary widely.

The assumption that outcomes drive the evaluation of a certain event is pervasive in the social sciences (Lind & Tyler, 1988); according to this assumption, people judge their social experiences in terms of the outcomes they receive. Attitudes towards tests and testing could thus be explained by these outcome-based judgments (Ambrose & Rosse, 2003). Contrary to this belief, process-based models assume that the psychological construction (perception) of an event is not only driven by outcome, but also by the process itself. The main postulate of these models is that people care not only about the allocations, but also about how allocations are made. Process-based models are relatively new (Thibaut & Walker, 1975). Also, there is some tension between outcome-based and process-based models (Lind & Tyler, 1988). The concept of procedural justice is central to research conducted in the relatively new process-based tradition.

This tradition differentiates between objective and subjective procedural justice. Objective procedural justice focuses on the capacity of a procedure to conform to normative standards of justice (Kaplan, 1986; Kassin & Wrightsman, 1985). Objective procedural justice is enhanced by reducing clearly unacceptable bias or prejudice, and psychological testing has long since moved away from clearly unacceptable practices. In order to maximize objective procedural justice we should focus on what is seen in normative documents (like, for example, the Standards for Educational and Psychological Testing) as rights of test takers, among them the right to have the procedure explained, the right to be tested in one's own language etc. Subjective procedural justice concerns the capacity of a procedure to enhance the fairness judgments of those who encounter the procedures. This is the area we will focus on in the following discussion.

Generally, procedural justice discusses the judgments of people that procedures and social processes are just and fair (Lind & Tyler, 1988). Specifically, in psychological and educational testing, procedural justice discusses the judgments of test takers regarding how just and fair a certain procedure they are subjected to is (Konovsky, 2000).

The questions which arise in this context are multiple and apply to the whole process of testing: how a test is chosen, how a test is administered, how the test is scored and interpreted. Subjective procedural justice is enhanced by promoting communication and transparency in the testing process and by allowing the test taker to get involved and take some responsibility in the different phases of the testing process (Ambrose & Rosse, 2003): choosing the test, administering the test, scoring the test, interpreting the test. We consider all these behaviors good practice.

More questions arise in this respect, specifically for every phase of the testing process.

Should test takers have a say in the decision to choose a certain test over another? Clearly, most test takers are not qualified to make this decision, as they have neither knowledge of the possible options for test usage in a testing context, nor the necessary technical expertise to reach a valid decision. The decision to use a certain test over another is taken by the testing specialist, based on the objective of the assessment, after which he/she usually narrows down a larger list of possible test options. Many times the final decision to choose a test over another from a short list is made in an arbitrary manner. Involvement of the test taker in at least this final decision will sometimes not damage the assessment process, but will boost the subjective feeling of procedural justice he/she will experience. Certain testing settings are more amenable to the application of this principle than others. For example, test taker involvement can be achieved more easily in an educational setting, when the goal is the measurement of vocational interests, but it can be extremely difficult or even unfeasible in a high-stakes selection context.

Should test takers have a say in the time or mode of administration? At many times, there is no possibility for the test user to accommodate the schedules of every test taker. In I/O or educational contexts, when testing is sometimes a large-scale and carefully scheduled procedure, accommodating the wishes of test takers will be impossible. At other times, flexibility is possible and is an important signal of cooperation. In most testing procedures there is at least a possibility of allowing the test taker a decision between alternative options for the time of testing. The test user should allow the test taker as much liberty as possible without damaging the assessment process. Also, the current state of technology has enabled many test authors and test editors to offer more than one standard way of administering a procedure. Many tests may be administered with the same results, in a standardized manner, in different ways, for example paper-and-pencil or computer and Internet-based. Even though sometimes the test user may have a personal preference for one mode, if there is consistent evidence that the mode of administration has no bearing on the results of the testing process, allowing the test taker to choose between paper-and-pencil vs. electronic testing will enhance the subjective feeling of procedural justice that the test taker will develop.

Should test takers be allowed to ask for a retest or re-scoring? It used to be considered a proof of good practice and cooperation, especially in educational testing (AERA, 2000), to allow test takers to ask for re-scoring. Information technology has made scoring errors very unlikely, especially when the test is scored electronically. Also, hand scoring forms have become more and more sophisticated and have for many tests embedded ways of easily checking if the scoring has been done correctly. However, even though scoring errors have become more and more unlikely, test takers' requests for re-scoring should be accommodated. At least in psychological testing (for example in personality assessment), the feedback of the test taker that he/she does not acknowledge the result of the test as being descriptive of his/her person should be reason enough for the test user to check the scoring. Sometimes test takers ask for a retest, i.e. for the repeating of the whole testing process. Certain domains, like cognitive ability testing or knowledge tests, to name only a few, involve learning as an important variable in test performance, and retesting would clearly go against reasonable accommodation. Other domains, like personality testing, would most of the time allow for retesting. Such a concession on behalf of the test user would certainly heighten the test taker's subjective feeling of procedural justice.

Should test takers be allowed to get involved in the interpretation of the test? For most testing areas, test takers do not have the technical background required to play an active part in the interpretation of test results. It is, nevertheless, considered good practice to involve the test taker in the interpretation of the results in such a way as to reach a common understanding between test user and test taker on the meaning of test scores. The process of feedback to the test takers has as one of its main targets the construction of such a shared understanding. Consequently, test takers are and should be involved as secondary parties, with a rather passive role, in the construction of meaning from test scores.

Sharing of results

Test users have a responsibility to the stakeholders in the testing process. One of them is the test taker him/herself. Sometimes, the test taker is the only stakeholder. Most times, there are several stakeholders, such as relatives of the test taker, parents of minors, supervisors or employers in organizational settings, attending doctors, psychiatrists, teachers and other professionals in educational settings, representatives of law enforcement, correctional or court personnel etc. Whom to share the results of a test with is sometimes a difficult question, which clearly stands in the area of ethics.

Different codes of ethics approach this aspect in different ways. Some do not address it at all, but there is a significant difference in approach even between those codes which address this issue. For example, Leach & Oakland (2007) have shown that clients have greater access to test data in the U.S. than in South Africa. Even in the U.S., different states rule in different manners on this point, and even in the same state the same court may rule differently on different occasions, as shown by the decisions of the California Supreme Court on Tarasoff v. Board of Regents (Leach & Oakland, 2009). In this respect, as Øvreeide (2008) shows, the connection between ethical and legal systems is important and it is much more likely the latter will be followed in issues related to the sharing of results.

The right to have the results communicated to him/her is a fundamental right of the test taker. However, sometimes, the right of the test taker to be informed acts against the very purpose of testing. This may happen for example in the fields of forensic psychology, clinical psychology and (in some cases) I/O psychology.

What other stakeholders to share results with is a difficult decision to make. Even more difficult is the decision about which information, in what depth, to share with whom.

Psychologists adhere from this point of view to diverging principles. On one hand, psychologists adhere to the principle of Respect for the Dignity of Persons and Peoples (Principle I of the Universal Declaration of Ethical Principles for Psychologists), with explicit values like e) privacy for individuals, families, groups, and communities and f) protection of confidentiality of personal information, as culturally defined and relevant for individuals, families, groups, and communities. On the other hand, psychologists adhere to the principle of Competent Caring for the Well-Being of Persons and Peoples, manifested in values like a) active concern for the well-being of individuals, families, groups, and communities; b) taking care to do no harm to individuals, families, groups, and communities; c) maximizing benefits and minimizing potential harm to individuals, families, groups, and communities. As we see, the values and principles of good practice urge the psychologist to respect privacy, while at the same time suggesting and enforcing the need of the test user to share results with other stakeholders aside from the test taker, when good is done by this, or for the benefit of minimizing harm to the test taker or to others. This could be the case, for example, when warning victims of possible harm. Discussions regarding the duty to protect are complex, and include different aspects of risk and protection management, as shown by Werth, Welfel & Benjamin (2009) and Leach (2009).

One of the possible solutions to this problem is informed consent, which will be discussed in a later section. Thus, aside from the situations prescribed by the law, which differ from country to country, sharing of data should only be done according to the wishes of the test taker and with the explicit consent of the test taker. The procedure for collecting informed consent includes enumeration of the uses the test data will have and of the persons these data will be shared with. It is therefore a case of good practice to share test data only with the people the test taker agreed to share with, prior to the testing. Asking the test taker for consent on the sharing of data after the testing is not as unintrusive, as a consent elicited in this way might be considered as forced on the test taker (though in a subtle manner) and might involve mechanisms aside from the real wish of the test taker to share.

Communication and reporting of results

As noted above, the right to have the results communicated to him/her is a fundamental right of the test taker. This right is explicitly stated as such by many codes of ethics and is suggested, though not explicitly stated, by the Universal Declaration of Ethical Principles for Psychologists. However, standards are unclear regarding to whom to communicate, what to communicate and in what manner to communicate test results.

The obligation of the test user to communicate test results is set by the provision of adherence to the value of a) honesty, and truthful, open and accurate communications. Post-testing feedback to the test taker on the results of the testing process and their meaning is considered good practice (Pope, 1992). Feedback to the test taker encourages mutual communication and not only has an important role for the test taker in clarifying the meaning of the results, but also an important role for the test user, for validating the interpretation he/she came to (Aiken & Groth-Marnat, 2005).

However, sometimes situations arise when the discussion of test results with the test taker may seem inopportune. In clinical settings, as well as in forensic and correctional settings, disclosure of test results to the test taker is a decision in which the test user has to balance his/her obligation to the test taker against the obligation to the client, who is oftentimes someone other than the test taker.

Also, there are situations when complete disclosure may seem dangerous or inappropriate. At least two of the values in the Universal Declaration of Ethical Principles for Psychologists apply to this difficult situation. First, psychologists abide by the value of g) respect for the ability of individuals, families, groups, and communities to make decisions for themselves and to care for themselves and each other. In light of this value, psychologists should prefer disclosure of data to the test taker, trusting his/her ability to understand the results and respecting his/her right to make his/her own decision regarding the results of the testing process.

Also, psychologists accept the value of b) avoiding incomplete disclosure of information unless complete disclosure is culturally inappropriate, or violates confidentiality, or carries the potential to do serious harm to individuals, families, groups, or communities. Disclosure of data should thus be complete. Incomplete disclosure should be avoided. Complete disclosure will be avoided in all those situations when the receiver of feedback is not the test taker him/herself and complete disclosure will violate confidentiality. Also, complete disclosure to the test taker him/herself will be avoided when there is potential that the disclosure will do harm to the test taker.

A free-floating discussion in light of these values gravitates around the adjective complete, as applied to disclosure.





Should all test data be released? Should this include item answers, raw scale scores, un-interpreted (visual) reports, or only interpreted, verbatim information?

Some of the concerns regarding the release of low-level test data, such as item answers, are closely connected to the problem of test security, which will be discussed in a later section of this paper. For example, the APA Statement on the Disclosure of Test Data (http://www.apa.org/science/disclosu.html) specifically points to this concern. Still, on the other hand, under the American Psychological Association Ethics Code (2002), documents containing the responses of test takers are ordinarily subject to disclosure, considering that the virtues of secrecy regarding testing are often exaggerated and its vices underestimated (Erard, 2004).

Release of un-interpreted data, like raw scores or visual reports, is coupled with concerns regarding user qualification and the technical ability of the test taker to understand these un-interpreted data. In this regard, test takers are not technically competent to understand raw data and the limitations of these data, which makes it possible for the results to be misinterpreted and misused, and to lead to misguided decisions with harmful effects. In light of the value of c) maximizing impartiality and minimizing biases, held by the Universal Declaration of Ethical Principles for Psychologists, test users should only disclose data which may be understood by the test taker and make all efforts to ensure that the data have been correctly understood, including their limitations.

The situation is further complicated by the fact that, as we well know, disclosure of all the data for an individual is most of the time pointless without revealing the mean, standard deviation or other characteristics of the distribution of scores for the intended reference group. Also, there are different methods of using test scores (e.g., top-down; multiple hurdle; banding) and data from a single individual mean nothing without knowledge of the underlying algorithms and cut-off scores used in the decision. Should these data be shared? Is it even realistic to expect the test taker to understand and use this kind of information in a coherent manner? It is our view that data of this depth are an inherent part of the testing procedure and should as such be protected by the principle of test security, which is discussed in a later section. In cases like this, test users should question the usability of any such information for the testee and other stakeholders. For example, sharing with a test taker the fact that he/she failed to pass a test by 1 point could make him/her upset, angry, and maybe eager to sue. In some settings, like for example in educational testing where no decisions are being made, it is perhaps worthwhile to share this information, while in other settings, like a hiring context, this would be helpful neither to the company nor to the testee. Consequently, the depth of information to share should be carefully considered and balanced with respect to usefulness and fairness for all stakeholders involved.
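
To illustrate why a single raw score says little without the characteristics of the reference group, the short sketch below converts a raw score into a z score and a T score (a scale with mean 50 and standard deviation 10). This is a hypothetical Python example: the function name and the norm values are assumptions made purely for illustration and do not come from any real test.

```python
def standard_scores(raw, norm_mean, norm_sd):
    """Return (z, T) for a raw score, given the reference group's mean and SD."""
    if norm_sd <= 0:
        raise ValueError("norm_sd must be positive")
    z = (raw - norm_mean) / norm_sd  # distance from the group mean in SD units
    t = 50 + 10 * z                  # T score: mean 50, SD 10
    return z, t


# Invented norms for two hypothetical reference groups.
raw_score = 60
print(standard_scores(raw_score, norm_mean=50, norm_sd=10))
print(standard_scores(raw_score, norm_mean=70, norm_sd=5))
```

The same raw score of 60 lies above the mean of the first invented group and below the mean of the second, which is why the raw number alone does not support an interpretation.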

The obligation to communicate test results to the test taker is treated by many test users in a mechanical manner, which abuses the underlying principle stated above and cannot be labeled as good practice. Anastasi (1997, p. 543) notes that psychologists have to approach communication of the test results in a form that will be meaningful and useful to the recipient. Pope, Tabachnick & Keith-Spiegel (1987) urge that feedback to the test taker and the communication of results should be approached in a professional manner. Hood & Johnson (1997) set the frame for a professional approach in feedback by explaining that feedback should be approached in conjunction with the intended purpose of testing and in light of the very specific questions that have been raised and for the solving of which the testing was employed in the first place.

Pope (1992) discusses 10 fundamental aspects of the feedback process, which would qualify communication of test results as a good practice, among them the framing of the feedback and acknowledging fallibility. The framing of the feedback will have serious implications on the way the data and implications are received and integrated, being influenced not only by evident variables, like order of presentation or language, but also by the tone of voice and other subtle mechanisms. Acknowledging fallibility is crucial for making the test taker aware of the limitations of the data and of the potential sources of bias or error in the results. In accord with the APA ethical principles (1992), we will consider it a case of best practice for psychologists to indicate any reservations they have regarding, or any limitations they see in, the results presented.

Informed consent

The testing process may be envisioned as a complex consulting process which requires communication on the delivery end, and also on the inception end. Clear and straightforward communication of the scope and goals of the testing process will straighten out expectations for both parties involved and will set the stage for a correct relationship.

This is even more important in testing and assessment than in other psychological work as testing could involve an invasion of privacy (Anastasi, 1997). Psychological testing, more so than educational testing, runs this risk, because psychological testing requires not only performance but also self-disclosure, and self-disclosure is an invasion of privacy. The test taker should be able to consent to or refuse the specific invasion of privacy that is implied by the test administered. Such an approach as a basis for testing will also have positive implications for the outcome of the testing, as it will minimize presentation bias and other forms of faking (Aiken & Groth-Marnat, 2005).

Informed consent is formulated as an underlying value of psychological work and thus also of psychological testing by the Universal Declaration of Ethical Principles for Psychologists, which as part of Principle I, Respect for the Dignity of Persons and Peoples, states that psychologists follow the value of d) free and informed consent, as culturally defined and relevant for individuals, families, groups, and communities.

Informed consent is not only a legal provision; indeed, in some codes of ethics it is not mentioned and some psychologists are not even aware of it (Iliescu, 2008). Still, we consider it good practice for psychologists to respect the right of clients to have full explanations of the nature and purpose of the techniques in language the client can understand (APA, 1992).

In certain testing settings, informed consent should be viewed in a narrower manner, as the consent part is necessarily limited. For example, in the I/O arena, refusing consent to take an employment test would usually be grounds for rejection in most countries. Or, in the educational arena, refusing consent to take a knowledge test is of course grounds for failing the exam. Still, even in these settings and others, where consent is presumed or enforced, the testee has a right to be informed correctly of his/her rights as a test taker and should have the scope and goals of the testing process communicated to him/her.

Informed consent is not a mechanical collection of a verbal acknowledgement or of written consent forms. Informed consent not only has a consent part, but also an informed part, which relates to the right of test takers to be informed, prior to testing, about relevant issues related to the testing process and the test itself (APA, 1998).

In keeping with the value of respect for the ability of individuals, families, groups, and communities to make decisions for themselves and to care for themselves and each other, test takers have a right to understand not only that an assessment is conducted, but also why an assessment is conducted, what procedures or techniques it will require, why these procedures are needed and what the outcome will be. In discussing the outcome of the testing, issues like disclosure of data, sharing of results, and feedback will be approached.

Informed consent has the status of a psychological contract between the test taker and the test user. As such, it may easily be given and received in verbal form. APA's The Rights and Responsibilities of Test Takers: Guidelines and Expectations (1998) and other similar documents state that test takers have the right to receive a brief oral or written explanation prior to testing. Albeit brief in coverage, the information procedure which precedes the consent or refusal by the test taker is more often than not rich in information. As such, it is often important to formalize this psychological contract and collect informed consent in written form. Both the information and the consent part are under these circumstances less prone to misinterpretation. We will consider it a case of good practice if the informed consent is collected in written form.

Test security and user qualifications

Test security is a major ethical concern of psychologists and is included as such in many codes of ethics. Even though the Universal Declaration of Ethical Principles for Psychologists has no direct correspondent for this ethical issue, several of its principles and values would apply to test security. Reasons why test security is of importance from an ethical point of view are multiple and we will briefly discuss some of the important points. As a preamble, though, it should be noted that, even though researchers who are active in the field of ethics note that psychologists and others commonly abuse this requirement (Oakland, 2005), there are very few pieces of research published on practices related to test security.

The performance of a test taker on a test may be considered a valid reflection of the target construct only if the test taker has been assessed in a controlled manner. In this case, controlled refers, among other things, to having no prior exposure to the items of the test. This is especially important in testing situations where learning or training on the test items could enhance the performance of the test taker. One of the main reasons why test security is of great concern is that by controlling a test, the professional community ensures the viability of testing with the respective instrument. Restriction of the reproduction and dissemination of psychological materials thus has a professional reason, related to the viability of a measure.

Test security also has a strong legal reason. Test authors and test editors are entitled to financial compensation for their work in developing and publishing the test. Issues of copyright and of potential infringement arise from this point of view. Some countries have very strict regulations with respect to copyright, while others ignore this issue altogether. Leach & Oakland (2009) consider that good practice in psychological testing should, with respect to copyright, abide by two international documents which have extensive legitimacy and provide a sound background: the Universal Copyright Convention (portal.unesco.org/culture/en/ev.php-url_id=1814&url_do=do_topic&url_section=201.html) and the Berne Convention for the Protection of Literary and Artistic Works (http://en.wikipedia.org/wiki/Berne_Convention_for_the_Protection_of_Literary_and_Artistic_Works).

Third, there is a strong reason related to the quality of psychological measures, which relates to test security. The translation or adaptation of tests without the approval of the current holder of copyright, be it the author or the test publisher, is also a violation of international copyright treaties (Leach & Oakland, 2007). Test authors and publishers usually control test adaptation in a careful manner, in order to make sure that the adapted version lives up to the heritage of the original test, i.e. has reasonable equivalence with the original (van de Vijver & Poortinga, 1991). Illegal test adaptation is a disturbingly widespread practice across the globe, especially in developing countries (e.g., Iliescu, 2008).

The advent of the Internet and related technologies has confronted psychologists with new problems related to these old issues of test security. Information presented over the Internet is much more volatile, which enhances the risks. But the potential advantages of this pervasive technology should not be discarded easily. Responsible professional associations have prepared statements and guidelines for professionals dealing with these issues (e.g., British Psychological Society Psychological Testing Centre, 2002; Naglieri et al., 2004). Of international importance in this respect is the initiative of the International Test Commission, which published its Guidelines for Computer-based and Internet delivered testing (Bartram & Coyne, 2005). This document addresses, in addition to technology, issues like quality, control and security, all of which are clearly founded on ethical concerns.

The Internet has generated new challenges, which are technological in nature. For example, there is a strong expectation for test publishers and other organizations which maintain testing sites to support thorough security, user access and data protection on their websites. Client-side, good practices in testing would require prevention of unauthorized copying of testing content by the unproctored test taker (Bartram, 1999). But aside from these technological challenges, testing over the Internet does not really raise new issues, but only places old issues in new containers (Naglieri et al., 2004).

The Internet has not only generated new problems regarding the testing process, but also potential pitfalls in the face of globalization and the increased ease of commercial transactions. Gregoire & Oakland (2008), for example, have recognized the danger posed by the free and unauthorized transaction of protected materials over the Internet and urged eBay and other companies to establish and maintain standards that prevent the unauthorized sale of tests and other professionally protected materials.

The formulation of the subject of our current discussion, test security, calls for a continuation. One secures a test against something. Tests are secured against unqualified users. This is why test security also raises the issue of qualified and competent usage of tests.

We will not discuss test user qualification in detail. It is however important to state that not all countries have clear rules and regulations related to test user qualifications. A clear regulation from this point of view should not only prescribe levels of qualification, but also define specific competencies (i.e. knowledge and skills) for every level of qualification. We consider APA's (2000) Report of the Task Force on Test User Qualifications an important document which could guide the development of national guidelines and which could define the limits of good practice in this domain. Another important document is the EFPA (European Federation of Psychologists Associations)-EAWOP (European Association of Work and Organizational Psychologists) joint initiative entitled European Test User Standards for test use in Work and Organizational settings (2005). Though limited to test usage in the domain of I/O Psychology, this initiative is of cross-national impact.

In light of the above, we will consider it a case of good practice when user qualifications are treated in a transparent and structured manner, with a clear and theoretically founded rationale for the breakdown into levels of qualification and a clear description of knowledge and skills for every level.

As part of the discussion regarding qualified usage of tests, two special classes of potential test users attract attention.

One is the case of students and others who study to become specialists in testing, but are not yet qualified to use tests. The situation is difficult insofar as these people have to use tests in order to become proficient, yet on the other hand are not yet qualified to do so. The only dedicated document covering this situation, to the best knowledge of the authors, is the APA Committee on Psychological Tests and Assessment (1994) Statement on the Use of Secure Psychological Tests in the Education of Graduate and Undergraduate Psychology Students. This short document covers best-practice guidelines regarding four main areas: security of test materials, testing demonstrations, teaching students to administer and score tests and using tests in research.

Another interesting discussion in this respect regards the usage of tests by the test takers themselves. Ethical discussions arise in this respect and cover questions like: Can test takers be trusted to self-administer? Shall they be trusted to self-score? Are they qualified to do either? Is it possible for them to misinterpret the results? Is it possible for them to understand issues of test security? Will they abstain from training others? Is it ethical for psychologists to design and market tests for the usage of test takers themselves? If yes, under what circumstances? These questions are not hypothetical, as ever more tests are self-administered and self-scorable. We have good examples of successful tests of this category in the area of vocational counseling and personality.

References

AERA (2000). Position Statement on High-Stakes Testing in Pre-K12 Education. Adopted July 2000.
AERA, APA, NCME (1999). Standards for educational and psychological testing. Washington, DC: AERA.
Aiken, L. R., & Groth-Marnat, G. (2005). Psychological Testing and Assessment (12th Edition). Upper Saddle River, NJ: Allyn & Bacon.
Ambrose, M. L., & Rosse, J. G. (2003). Procedural Justice and Personality Testing. Group & Organization Management, 28(4), 502-526.
American Psychological Association (2002). Ethical principles of psychologists and code of conduct. American Psychologist, 57, 1060-1073.
Barnhart, M. (2002). Introduction. M. Barnhart (Ed.). Varieties of ethical reflection: new directions for ethics in a global context. Lanham, MD: Lexington Books.
Bartram, D. (1999). Testing and the Internet: Current realities, issues and future possibilities. Keynote paper for the 1999 Test User Conference.
Bartram, D. (2000). International Guidelines for Test Use. Punta Gorda, FL: ITC.
Bartram, D., & Coyne, I. (2005). International Guidelines on Computer-Based and Internet-Delivered Testing. Punta Gorda, FL: ITC.
Bartram, D., & Hambleton, R. (Eds.) (2006). Computer-based testing and the internet: Issues and advances. New York: John Wiley.
British Psychological Society Psychological Testing Centre (2002). Guidelines for the Development and Use of Computer-based Assessments. Leicester: British Psychological Society.
Dimitrov, D. M. (2002). Reliability: Arguments for multiple perspectives and potential problems with generalization across studies. Educational and Psychological Measurement, 62(5), 783-801.
Dudek, F. J. (1979). The continuing misinterpretation of the standard error of measurement. Psychological Bulletin, 86(2), 335-337.
EFPA, EAWOP (2005). European Test User Standards for test use in Work and Organizational settings. [http://www.efpa.eu/download/2d30edd3542f33c91295487b64877964].
Fan, X., & Thompson, B. (2001). Confidence intervals about score reliability coefficients, please: An EPM guidelines editorial. Educational and Psychological Measurement, 61, 517-531.
Gauthier, J. (2008). Universal declaration of ethical principles for psychologists. In J. E. Hall, & E.M. Altmaier (Eds.), Global promise: Quality assurance and accountability in professional psychology (pp. 98-105). New York: Oxford University Press.
Gregoire, J. & Oakland, T. (2008). On the need to secure psychological test materials. [http://intestcom.org/archive/ebay.php].
Harvill, L. M. (1991). Standard Error of Measurement (A NCME instructional module). Items, 9, 181-189.
Hood, A. B., & Johnson, R. W. (1997). Assessment in counseling: A guide to the use of psychological assessment procedures (2nd Edition). Alexandria, VA: American Counseling Association.



Iliescu, D. (2008, July). Romanian psychologists view on ethic test usage. Paper presented at the 29th International Congress of Psychology (ICP), Berlin.
Keith-Spiegel, P., & Koocher, G. P. (1995). Ethics in psychology: Professional standards and cases. London: Lawrence Erlbaum.
Kline, P. (1993). The handbook of psychological testing. London: Routledge.
Konovsky, M. A. (2000). Understanding Procedural Justice and Its Impact on Business Organizations. Journal of Management, 26(3), 489-511.
Leach, M. M. (2009). International ethics codes and the duty to protect. In J. Werth, E.L. Welfel, & A. Benjamin (Eds.), The duty to protect: Ethical, legal, and professional considerations in risk assessment and intervention. Washington, DC: American Psychological Association.
Leach, M.M., & Oakland, T. (2007). Ethics standards impacting test development and use: A review of 31 ethics codes impacting practices in 35 countries. International Journal of Testing, 7, 71-88.
Leach, M.M., & Oakland, T. (2009). Displaying Ethical Behaviors by Psychologists when Standards are Unclear. Manuscript submitted for publication.
Lind, E. A. & Tyler, T. R. (1988). The social psychology of procedural justice. London: Springer.
Lindsay, G., Koene, C., Øvreeide, H., & Lang, F. (2008). Ethics For European Psychologists. Gottingen: Hogrefe & Huber.
Naglieri, J. A., Drasgow, F., Schmit, M., Handler, L., Prifitera, A., Margolis, A., & Velasquez, R. (2004). Psychological Testing on the Internet, New Problems, Old Issues. American Psychologist, 59, 150-162.
Nunnally, J. C., & Bernstein, I. H. (1994). Psychometric theory (3rd ed.). New York: McGraw-Hill.
Oakland, T. (2005). Selected ethical issues relevant to test adaptations. In Hambleton, R., Merenda, P., & Spielberger, C. (Eds.). Adapting educational and psychological tests for cross-cultural assessment. Mahwah, NJ: Erlbaum Press.
Pope, K. S. & Vetter, V. A. (1992). Ethical dilemmas encountered by members of the American Psychological Association: A national survey. American Psychologist, 47, 397-411.
Pope, K. S. (1992). Responsibilities in Providing Psychological Test Feedback to Clients. Psychological Assessment, 4(3), 268-271.
Pope, K. S., & Vasquez, M. J. T. (2007). Ethics in Psychotherapy and Counseling (3rd Edition). NY: Jossey-Bass.
Pope, K. S., Tabachnick, B. G., & Keith-Spiegel, P. (1987). Ethics of Practice: The Beliefs and Behaviors of Psychologists as Therapists. American Psychologist, 42(11), 993-1006.
Roberts, R. C., & Wood, W. J. (2007). Intellectual Virtues: An Essay in Regulative Epistemology. Oxford: Oxford University Press.
Rolls, S., & Feltham, R. (1993). Practical and professional issues in computer-based assessment and interpretation. International Review of Professional Issues in Selection, 1, 135-146.
Society for Industrial and Organizational Psychology (SIOP) (2003). Principles for the validation and use of personnel selection procedures (4th ed.). Bowling Green, OH: Author.
The APA Committee on Psychological Tests and Assessment (CPTA) (1994). Statement on the Use of Secure Psychological Tests in the Education of Graduate and Undergraduate Psychology Students. [http://www.apa.org/science/securetests.html].
The APA Test Taker Rights and Responsibilities Working Group of the Joint Committee on Testing Practices (1998). The Rights and Responsibilities of Test Takers: Guidelines and Expectations. [http://www.apa.org/science/ttrr.html].
Thibaut, J. & Walker, L. (1975). Procedural Justice: A Psychological Analysis. Hillsdale, NJ: Erlbaum.
Thompson, B., & Vacha-Haase, T. (2000). Psychometrics is datametrics: The test is not reliable. Educational and Psychological Measurement, 60, 174-195.
Tippins, N., Beaty, J., Drasgow, F., Gibson, W., Pearlman, K., Segall, D., & Shepherd, W. (2006). Unproctored Internet testing in employment settings. Personnel Psychology, 59(1), 189-225.
Tyler, T. R. (1989). The Psychology of Procedural Justice: A Test of the Group Values Model. Journal of Personality and Social Psychology, 57, 330-338.
van de Vijver, F. J. R., & Poortinga, Y. H. (1991). Testing across cultures. R. K. Hambleton & J. N. Zaal (Eds.). Advances in Educational and Psychological Testing. Boston: Kluwer Academic Publishers, 277-308.
Westhoff, K., & Kluck, M. L. (2008). Psychologische Gutachten [Psychological reports] (5th Edition). Heidelberg: Springer.

Suggested hyperlinks to web pages

http://www.apa.org/science/disclosu.html; APA Statement on the Disclosure of Test Data.
http://www.efpa.eu/download/2d30edd3542f33c91295487b64877964; EFPA, EAWOP (2005). European Test User Standards for test use in Work and Organizational settings.
http://www.apa.org/science/securetests.html; The APA Committee on Psychological Tests and Assessment (CPTA) (1994). Statement on the Use of Secure Psychological Tests in the Education of Graduate and Undergraduate Psychology Students.
http://www.apa.org/science/ttrr.html; The APA Test Taker Rights and Responsibilities Working Group of the Joint Committee on Testing Practices (1998). The Rights and Responsibilities of Test Takers: Guidelines and Expectations.
http://www.aera.net/?id=378; AERA (2000). Position Statement on High-Stakes Testing in Pre-K12 Education. Adopted July 2000.
http://www.intestcom.org/guidelines/index.php; Guidelines of the ITC (ITC Guidelines on Adapting Tests, ITC Guidelines on Test Use, CBT & Internet Guidelines)
http://www.psychtesting.org.uk/downloadfile.cfm?file_uuid=64877B7B-CF1C-D577-971D-425278FA08CC&ext=pdf; British Psychological Society Psychological Testing Centre (2002). Guidelines for the Development and Use of Computer-based Assessments. Leicester: British Psychological Society.
http://www.sipsych.org/english/Universal%20Declaration%20as%20ADOPTED%20by%20IUPsyS%20&%20IAAP%20July%202008.p; Gauthier, J. (2008). Universal declaration of ethical principles for psychologists. In J. E. Hall, & E.M. Altmaier (Eds.), Global promise: Quality assurance and accountability in professional psychology (pp. 98-105). New York: Oxford University Press.


Questions for classroom discussion

1. Should we follow local or international documents prescribing ethical principles? Which should be followed in what situations and why?
2. Can a test with known reliability problems be used ethically? If yes, under what circumstances?
3. Which would you choose: a more fair test or a more valid one? Which one of these two principles should have precedence: fairness in testing or validity and the responsibility towards the client to achieve the best possible result?
4. The feedback organizations give to job applicants after the selection process is usually quite vague. Do you see any ethical problems here?
5. In your opinion, has the profession of I/O psychology abandoned the psychological heritage of responsibility and service to society?
6. When the needs and wishes of the client who commissioned the testing project and those of the testee are different, whom should the psychologist follow? Is it possible to accommodate both?
7. Regarding the conflict between adhering to professional standards and yielding to client demands: Should a professional always conform to professional standards? Under what circumstances is it acceptable to lower standards in order to accommodate client demands? Under what circumstances is it not acceptable to lower standards?
8. Under what circumstances do you think that the situation/context will take precedence in governing an action as ethical/unethical, over a set of values?
9. Do you think ethical codes are sufficient or do we need laws to govern the activities of testing professionals?

Author information

Dragos Iliescu (dragos.iliescu@testcentral.ro) holds a PhD in I/O psychology from the Babeș-Bolyai University, Cluj-Napoca, Romania. He is Associate Professor at the National School for Political and Administrative Studies (SNSPA) in Bucharest, and Managing Partner of D&D/Testcentral, the major test publisher in Romania. He has worked for 12 years as a consultant in the field of business research, in areas like marketing research, branding and HR, for Romanian and international clients.

Dan Ispas ([email protected]) is a doctoral candidate in I/O psychology at the University of South Florida, Tampa, Florida, USA. He holds an M.A. in I/O psychology from the same university. His research interests include organizational interventions, counterproductive work behaviors, personality and affect in the workplace, and test development and validation. His research was presented at the Annual Conferences of the Society for Industrial and Organizational Psychology and the Academy of Management and was published in Human Resource Management Review, Industrial and Organizational Psychology: Perspectives on Science and Practice, Psihologia Resurselor Umane and The Industrial-Organizational Psychologist.

Michael M. Harris (d. 2009) was the Thomas Jefferson Professor of Management in the College of Business Administration at the University of Missouri-St. Louis. He had a Ph.D. in I/O psychology from the University of Illinois-Chicago. Most of his work revolved around selection and hiring practices and compensation systems, with a focus on staffing/selection, compensation, and performance management, both in the domestic context and the international context. Michael published numerous peer-reviewed articles in the area of human resource management and edited several books, including the Handbook of Research in International Human Resource Management (Lawrence Erlbaum, 2007). He served as a keynote speaker at the International Test Commission's conference in England, June, 2002, where he made a presentation entitled: "Patrolling the Information Highway: Creating and Maintaining a Safe, Legal, and Fair Environment for Test-takers."
