An overview of current knowledge about the impacts of forest management certification: A proposed framework for its evaluation

The often-claimed environmental and social benefits of forest certification remain to be empirically evaluated. Despite numerous publications on the impacts of tropical forest certification, virtually all are based on secondary sources of information and not on field-based measurements. This paper proposes an empirical research framework for a carefully designed field-based evaluation of the ecological, social, economic and political impacts of tropical forest management certification, taking into account the location-specific contextual factors that shape certification outcomes. The paper also suggests that sound quantitative and qualitative methods be used to build proper counterfactuals on which to base the comparisons for inferring impacts, all informed by a thorough theory-of-change and carried out through processes that bring stakeholders together. The proposed research framework represents a first step towards the design and future implementation of evaluation research on tropical forest certification on a global basis. It is hoped that the proposed framework will contribute to learning from past mistakes, building on lessons learned and enhancing decision-making towards the maintenance of forest values over the long term, for the benefit of society as a whole.


Forest management certification is a market-based mechanism to promote sustainable use of forest resources. It recognizes responsible management through independently verified compliance with a set of underlying principles, criteria and indicators that delineate the ecological, social, economic and policy impacts resulting from forest management for specific objectives. As such, a credible label of certification makes the positive externalities of proper forest management visible to the public (Roberts 2012). The emergence of certification in the late 1980s was motivated by failures of other efforts to halt deforestation and improve forest management. While the launch of the 'Forest Principles' at the United Nations Rio Summit in 1992 recognized the urgency to manage forests to meet the needs of present and future generations, the global community could not come up with a legally binding agreement to halt forest loss, particularly in the tropics. Alarm about tropical forest destruction and concerns about the unintended consequences of boycotts of forest products inspired diverse stakeholders to collaborate on an initiative based on the concept of certification of forest management (Viana et al. 1996; Elliott 2000; Cashore et al. 2004; Auld et al. 2008; Cashore and Auld 2012; SCR 2012).
As is the case with many forest management and conservation interventions (e.g., payments for environmental services or establishment of protected areas), there is insufficient empirical evidence on the impacts of certification to generate lessons learned at the global scale. While several published reviews of forest management certification provide some guidance for future work, most were based on geographically limited case studies, indirect information or anecdotal observations and were not conducted by independent observers. Many forest stakeholders now agree on the need to critically assess when, where, how, to what extent, why, at what cost to whom and for how long certification changed the ways forests are managed. In support of such assessments, this paper provides a critical description of known impacts of certification on the fates of natural forests in developing countries, along with a brief review of the literature on the evaluation of conservation interventions. Based on this background, the paper then proposes a roadmap towards the design of a formal evaluation of the empirical biophysical, social, economic and policy impacts of the Forest Stewardship Council (FSC) certification of timber extraction in natural tropical forests (Box 1). The paper also stresses that for a proper evaluation it is critical to understand the national and local contexts (social, political, biophysical, economic) that affect the implementation and ultimately permanence of certification's impacts in a given forest. Much of the information needed to operationalize the roadmap presented here is considered essential to this end.
In this paper, 'certification impacts' refer to those changes in the forest itself and surrounding areas that are attributable to certification at several levels: neighboring local communities and workers; participating forest management units (FMUs), which are forests managed for timber production by private forest owners, concessionaires, industrial groups and states legally recognized by corresponding authorities; and local and national governments and legal frameworks. Although assessing impacts along the forest product market chain (e.g., chain-of-custody auditing process) is important, the paper focuses on the forests and associated institutions and agents. It applies the definition of impacts of the Organization for Economic Co-operation and Development (OECD): 'the positive and negative, primary and secondary long-term effects produced by a development intervention, directly or indirectly, intended or unintended' (OECD 2002). Further, the use of the term 'sustainability' in this document refers to one of the goals of forest management certification; the more restricted term 'responsible' refers to what is actually certified. The question of sustainability of forest management has been the focus of governmental and inter-governmental processes that have put forth collections of criteria and indicators to promote or purport to guarantee sustainable forest management; these include the Montreal Process, a framework that defines indicators associated with sustainable forest management in non-tropical countries (Prabhu et al. 1998; Washburn and Block 2001; ITTO 2005; Pintér et al. 2012). The rationale behind certification standards certainly builds upon, and has learned from, these debates.

Diversity of forest management certification schemes
Numerous certification schemes operate at national levels in several countries. The two largest international certification bodies are the Programme for the Endorsement of Forest Certification (PEFC) and the Forest Stewardship Council (FSC; Box 1). Created in 1993, the FSC operates through broadly discussed and agreed principles, criteria and indicators and serves as the oversight and certification-granting organization to which accredited certifying bodies (CBs) report on the implementation of, and compliance with, its standards. FSC has worked with other forest management certification schemes such as Lembaga Ekolabel Indonesia (LEI), a non-profit multi-stakeholder organization that promotes just and sustainable forest management. The other large certification scheme, PEFC (formerly the Pan-European Forest Certification), works mostly in Europe and the USA and in a very different manner than the FSC. Most fundamentally, the PEFC has no principles and criteria of its own, but instead recognizes the activities of autonomous certification schemes. In tropical countries, these schemes include the Brazilian Forest Certification Program (CERFLOR); the Malaysian Timber Certification Council (MTCC); and the Pan-African Forest Certification (PAFC) in Gabon. In other words, PEFC provides another layer of assurance that the standards of approved certification bodies are met (SCR 2012).
To render FSC principles and criteria more operational and locally appropriate, most countries and regions have, or are developing, detailed sets of standards, indicators and verifiers against which representatives of certifying bodies audit (Table 1). To be eligible for FSC approval, these national and regional standards must be developed in a participatory manner according to specified procedures. In the absence of detailed standards, CBs can propose for approval by the FSC their own sets of indicators based on FSC Principles and Criteria (FSC 2013) and, on some occasions, develop their own generic regional standards (e.g., Rainforest Alliance for Indonesia). The FSC is the largest certification scheme in the tropics (Figure 1). While this paper focuses on the evaluation of FSC certification, its approach could also apply to PEFC.
Numerous studies have compared forest certification schemes, particularly on the basis of the stringency of their standards, modes of operation and constituencies (Vogt et al. 2000; Cashore 2002; Holvoet and Muys 2003; Oliver 2004; Fischer et al. 2005; WWF/World Bank Global Forest Alliance 2006; Auld et al. 2008; McDermott et al. 2008; Tikina and Innes 2008; Overdevest 2010; Clark and Kozar 2011; Johansson and Gun 2011). Another study compared the standards of the FSC and the North American-based Sustainable Forestry Initiative (SFI) with the Montreal Process. Like certification, the Montreal Process aims to promote sustainable forest management, but it focuses at the national level and takes a descriptive approach. In contrast, FSC's forest management certification operates at the level of discrete or tightly linked FMUs in a prescriptive manner, with outcomes required (Washburn and Block 2001).
The proliferation of certification schemes is understandable given the broad scope of potential applications (national, regional or international), the diverse cultures of forest industries and their associated governmental agencies, and other biophysical and institutional differences in the places where these schemes have emerged. These certification schemes vary fundamentally in the involvement of stakeholders in defining their standards and in the degree to which their requirements exceed those set by national laws. They also vary in the breadth of their requirements.
Overall, most of the comparative studies mentioned above concluded that the FSC is the strictest and most complete because it explicitly addresses sustainability related to:
• political issues (e.g., respect of applicable laws, FSC Principle 1);
• environmental issues (e.g., environmental impact, maintenance of high conservation value forests, FSC Principles 6 and 9);
• social issues (e.g., workers' rights and employment conditions; tenure, use rights and responsibilities; indigenous peoples' rights; community relations and benefits from the forests, FSC Principles 2, 3, 4 and 5); and
• economic issues (e.g., benefits from the forest, management plan, monitoring and assessment, FSC Principles 5, 7 and 8).
In addition, as Clark and Kozar (2011) and others point out, most of the other certification schemes focus only on parts of the production chain (e.g., legal, biophysical or social issues). It is also worth mentioning that proliferation of schemes and claims of sustainability of forest management remain a concern insofar as they can confuse consumers and jeopardize the credibility of certification (Putz 2004; Fischer et al. 2005).
Since 'equity' is a contested concept, the author explicitly discusses the goals and outcomes of equity as it pertains to these schemes. One finding is that stakeholders' level of trust in certifiers affects the certification process and, as a consequence, equity outcomes. For the FSC, the author reports that despite efforts to secure stakeholder participation in discussions of standards (e.g., empowering through engagement at several scales), equitable distribution of the costs and benefits of certification was not addressed (i.e., stakeholder engagement does not guarantee equitable distribution of costs and benefits).

Synergies with other policies related to forest governance
Forest certification does not act in a vacuum. It is implemented in particular social, institutional and political contexts that further influence decisions regarding forest use. One example is decentralization of control over resource management (e.g., Pacheco 2004; Agrawal 2007; Agrawal et al. 2008; Bowler et al. 2010; Brooks et al. 2012). Insights are also needed about how the impacts of certification compare with other interventions with shared goals including, most prominently, governmental regulations (Lee and Norris 2012). In addition, private forest companies that seek certification operate in countries that may be affected by various recent national and international efforts to assure the legality of forest products. These efforts include, most prominently, the Forest Law Enforcement, Governance and Trade (FLEGT) action plan by the European Union and its Voluntary Partnership Agreements (www.eufl.rgt.int), as well as the amended Lacey Act (2008) in the USA (www.forestlegality.org). Although both of these initiatives focus on the legality of timber trade, they ultimately seek to foster better governance along the entire production chain, from harvesting to consumption. Synergies between efforts to assure legality and forest certification schemes may need to be explored to avoid redundancies (Carlsen et al. 2012; Tind 2012) and to otherwise increase their mutual effectiveness and efficiency (Vogel 2008). Furthermore, linkages between forest management certification and forest conservation interventions based on payments for environmental services (e.g., water, biodiversity and/or carbon) could enhance the effectiveness of certification and address some critical market failures in the timber trade (Hyde 2012).

Evaluation in the context of forest certification
As stated above, the need for a critical evaluation of the empirical impacts of forest certification has gone unfulfilled to date. Such impacts include changes attributable to FSC certification in the forest itself and surrounding areas; at the level of neighboring local communities and workers; for participating FMUs; and in local institutions and others influenced by certification. Among the many possible reasons for the lack of a critical evaluation are: the assumption that certification is inherently environmentally, economically, politically and socially beneficial; the cost of such a study; the fact that in some regions (e.g., the Congo Basin) certification only recently became important to forest management decisions; and the inherent methodological and logistical challenges in evaluating the potential direct and indirect impacts of such a complex intervention in a wide variety of forests under a diverse set of ecological, socio-economic and political conditions. Additional constraints derive from the long-term scope of most conservation interventions and their often vague objectives (e.g., biodiversity conservation, maintenance of ecosystem integrity, sustainability or social well-being). This vagueness makes it difficult to identify specific elements to assess. Finally, in addition to the lack of funding and incentives for evaluation, there is a lack of personnel trained in the rapidly evolving field of environmental impact evaluation (Ferraro 2009; Mickwitz and Birnbaum 2009). Lack of proper evaluation of the impacts of forest certification heightens the risk of poor downward and upward accountability, including in the process of certification itself (Rogers 2012).
It is also important to determine the extent to which certification delivers on its promises of maintaining forest values in order to inform its supporters if they are 'getting their money's worth' from the intervention. That is, understanding the factors and conditions that lead to impacts can help stakeholders compare the outcomes of certification with the returns from other possible conservation investments (e.g., supporting green markets as opposed to actively contributing to protected area management or other conservation alternatives). Further along the certification pathway, the tangible, less tangible and dynamic direct and indirect costs and benefits to FMUs of getting and remaining certified must be understood. In other words, the impacts of certification on the managed forests themselves and in neighboring areas need to be formally evaluated.
As an illustration of the consequences of the current lack of a proper evaluation of certification, imagine that certification of a particular FMU was associated with achievement of the biophysical objective of protecting riparian buffer zones. Unfortunately, buffer zone protection resulted in a conflict with a local community whose traditional uses of natural resources in these same areas were precluded. Awareness of such tradeoffs and other sorts of contested situations is clearly a first step towards their solution. It sets in motion deliberative processes and agreements involving all relevant parties, enforcement mechanisms, sanctions for lack of compliance and monitoring and verification activities performed in ways conducive to joint decision-making, cost minimization and benefit-sharing. Applying an evaluation lens at the onset of designing an intervention might help prevent such conflicts and, further down the road, facilitate identification of unexpected outcomes and suggest ways to deal with their occurrence.
In this paper we address two main goals of implementing a formal evaluation of forest certification: (i) to assess the extent to which certification is the direct and indirect driver of observed changes in the outcomes of forest management (i.e., the likely or achieved short- and medium-term effects of an intervention; OECD 2002); and (ii) to determine how other interventions and processes contribute to particular outcomes related to certification. As a whole, these hoped-for outcomes include maintenance or enhancement of forest values (e.g., biodiversity, ecosystem service provision); social welfare of forest owners, workers and local people (e.g., health and education, access to credit, increased assets); the financial and legal status of certified FMUs; and changes in policy frameworks. To distinguish the impacts of certification from those related to other contextual factors and interventions [that is, to distinguish the impacts given in (i) and (ii) earlier in this paragraph], the causal pathways that led to any observed changes need to be understood. In short, the key factors that affect decisions driving the certification-related changes need to be identified (SCR 2012).
To identify the causal pathways and to understand how they operate requires a formal theory-of-change of the certification intervention (a theory-of-change is a model that describes how the process of change itself occurs). For example, management decisions at the FMU level are made based on opportunities related to silvicultural knowledge, traditions, technical and financial capacities, market information, timber yields, inputs from social actors and governmental policies and regulations, among other factors. At the same time, decisions are constrained, or at least modulated, by a range of factors including lack of technical capacity, as well as policy, institutional and market failures. The interplay of enabling and constraining factors clearly influences management choices and the resulting outcomes in the forest and further afield. That said, evaluations should be grounded in active participation by all relevant stakeholders to give jointly created knowledge credibility (i.e., true and technically adequate for handling evidence), salience (i.e., relevance and value to decision-makers and other evaluation users) and legitimacy (i.e., fairness of knowledge gathering, unbiased and respectful; Mollinga 2010; Rowe 2012). Legitimacy should also rest on the shared belief that an entity's actions are desirable, proper and appropriate, based on institutionalized norms and practices that confer authority (Bernstein 2004). This emerging knowledge evolves as participants in the evaluation learn and as contexts change. Thus, the tools and processes of evaluation need to accommodate the new knowledge generated by participants and reflect new benchmarks in the collection of evidence of impacts (Mickwitz and Birnbaum 2009; Rogers 2009; van Stolk et al. 2011). Moreover, insufficient participation of appropriate stakeholders in the evaluation process hinders social learning and may lead to equivocal allocation of costs and benefits. In the absence of a mechanism for learning from past mistakes, other forest conservation interventions and the process of certification itself may suffer. In the particular case of certification, lack of a systematic assessment deprives stakeholders, from local social actors and resource managers to policy-makers and donors, of information to assess the tradeoffs among the different domains of forest management sustainability. Likewise, it denies them the knowledge needed to make evidence-based, and thus more informed, decisions.
To retain its utility, an evaluation of certification should also capture any changes that affect its impacts over time. Such changes could include reduced added-value of the certification intervention due to improved skills of both forest managers and administrators (e.g., harvesting costs diminished for the FMU due to streamlined forest operations and increased efficiency during timber harvesting). Furthermore, impacts that come from contextual changes beyond the certification intervention itself need to be addressed (e.g., low current additionality: early profits from certification might diminish over time with improved legal frameworks and enforcement, which would reduce the need for certification to maintain forest values).

Background on the evaluation of conservation interventions
Programme evaluation is a well-developed field, particularly with regard to assessing the impacts of public policies (e.g., welfare benefits such as improved health and education) and social and development interventions (e.g., conditional cash transfers). In contrast, evaluation of conservation interventions, including certification, remains a contested field of enquiry (Cook et al. 2010) and lags behind despite a recent flurry of publications (see later in this section). Evaluations seek to understand systems and processes through the generation of knowledge situated between research, policy and practice. As already mentioned, a main question is how an intervention changes specific variables (attribution) that produce particular outcomes. It is also important to discern how the ecological, social, economic and policy contexts in which an intervention is embedded influence its outcomes (contribution). That is, direct linear responses to certification and other complex interventions are unlikely due to the multitude of natural and human-induced processes acting at different scales in both time and space (Levin et al. 2013). In these complex social-ecological systems, the multiple causal mechanisms that operate simultaneously are context dependent and prone to unpredictable feedback loops that give rise to emergent outcomes (i.e., recursive causality; Rogers 2008).
The indirect positive or negative effects of conservation interventions are particularly hard to capture. As one spillover effect, for example, an FMU might employ good management practices because it hired a worker formerly trained in a certified FMU rather than as a direct response to certification requirements. Nevertheless, these spillover effects need to be understood to assess the overall efficiency and effectiveness of conservation interventions such as certification (Nussbaum and Simula 2004). In the scenario above, evaluation of certification would require determining if certified FMUs are the only ones doing proper management because they have hired all the high-quality available contractors (i.e., leakage). Likewise, evaluation processes should help establish if governments have focused on enforcement of uncertified FMUs because they believe that the tracking of compliance with legal requirements already takes place in certified units.
Several approaches have been used to evaluate conservation interventions. They include robust experimental methodologies through randomized control trials; more qualitative and less informative analyses of the causal effect of the intervention on the outcomes; and more qualitative non-experimental methods (Bamberger and White 2007; GAO 2009; Table 2). These approaches also include the now popular systematic reviews, for which studies with robust results are still much needed (Pullin et al. 2009; Bowler et al. 2010).

Positive self-selection
As with any voluntary scheme, participants self-select into certification. The resulting 'positive selection bias' can obscure insights that could be derived from random allocation of FMUs to the treatment (certified) and control (uncertified) groups, as in randomized experiments. Selection bias renders it more difficult to separate the effects of the certification intervention from the direct and indirect influences of other contextual factors; it also makes it more challenging to identify the extent to which the intervention caused any observed impacts (i.e., internal validity; Chen et al. 2011a). However, few studies of the impacts of conservation interventions adequately addressed selection bias. This oversight often resulted in exaggeration of the direct positive impacts of the interventions (Ferraro and Pattanayak 2006; Pattanayak et al. 2010). These deficiencies have recently started to be addressed; several practical examples of ways to avoid them are now available for the conservation arena (Andam et al. 2008; Sims 2010; Ferraro et al. 2011; Nelson and Chomitz 2011; Alix-Garcia et al. 2012; Arriagada et al. 2012; Laufer et al. 2013).

Table 2

Approach: Experimental
Description: Randomly selected FMUs were allocated to the forest certification intervention.
Limitations: Selection bias is likely because certification is voluntary. A comparison based on the experimental approach is not feasible.

Approach: Quasi-experimental
Description: Because the certification treatment was not randomly allocated, a comparison group of uncertified FMUs needs to be constructed (counterfactual). The treatment and control groups should only differ in their certification status.
Limitations: Comparison group construction is data intensive and technically difficult. Approaches include matching techniques (e.g., groups of certified and non-certified FMUs matched by factors that influence certification outcomes) and instrumental variables (e.g., correlated and easier-to-assess variables are used to infer impacts), among others.

Approach: Before-after
Description: Baseline data on key outcomes related to the certification intervention are measured and compared with data corresponding to the post-certification condition.
Limitations: Data are often not available for all the variables before certification was granted for both treatment (i.e., certified) and control groups.

Approach: Systematic review
Description: Intensive analyses of certified FMUs, drawing on the history of the FMU and how the particular nature of the mechanisms and contextual factors produced change.
Limitations: Time-consuming and knowledge-demanding method: requires robust results of properly designed studies and thus fails to determine the integrated impacts of forest management certification unless available literature exists.

Approach: Expert judgment
Description: Assess the impacts of certification through compilation and synthesis of statements of people with profound knowledge of certification and the contexts in which forest management occurs.
Limitations: Because forest management certification is complex, this approach can be informative but may fail to capture the integrated effect of certification-driven changes and interactions with contextual factors.

Source: adapted from Romero and Castrén (2013). Note: Table is based on the framework proposed by GAO (2009).
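To make the matching idea concrete, the following is a minimal sketch, not a method taken from the studies cited. It pairs each certified FMU with the most similar uncertified FMU on standardized covariates and compares outcomes within pairs. All FMU names, covariates (log forest area, distance to the nearest road) and outcome values are invented for illustration, and the function names `standardize`, `matched_effect` and `naive_difference` are hypothetical.

```python
import math

# Hypothetical FMU records: (name, certified?, covariates, outcome).
# Covariates: (log forest area, distance to nearest road in km); invented values.
# Outcome: e.g. share of riparian buffer left intact; invented values.
FMUS = [
    ("A", True,  (4.0, 10.0), 0.90),
    ("B", True,  (5.0, 40.0), 0.80),
    ("C", False, (4.1, 12.0), 0.70),
    ("D", False, (5.2, 38.0), 0.75),
    ("E", False, (2.0, 90.0), 0.30),
]


def standardize(rows):
    """Rescale each covariate to mean 0 and sd 1 so no covariate dominates the distance."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    sds = [
        math.sqrt(sum((x - m) ** 2 for x in c) / len(c)) or 1.0
        for c, m in zip(cols, means)
    ]
    return [tuple((x - m) / s for x, m, s in zip(r, means, sds)) for r in rows]


def naive_difference(fmus):
    """Unmatched comparison: mean outcome of certified minus mean of all uncertified FMUs."""
    treated = [f[3] for f in fmus if f[1]]
    controls = [f[3] for f in fmus if not f[1]]
    return sum(treated) / len(treated) - sum(controls) / len(controls)


def matched_effect(fmus):
    """Nearest-neighbour matching (with replacement) on standardized covariates.

    Returns the mean within-pair outcome difference and the list of pairs.
    """
    z = standardize([f[2] for f in fmus])
    treated = [(f, zi) for f, zi in zip(fmus, z) if f[1]]
    controls = [(f, zi) for f, zi in zip(fmus, z) if not f[1]]
    pairs, diffs = [], []
    for f, zi in treated:
        # Closest uncertified FMU in covariate space serves as the counterfactual.
        match, _ = min(controls, key=lambda c: math.dist(zi, c[1]))
        pairs.append((f[0], match[0]))
        diffs.append(f[3] - match[3])
    return sum(diffs) / len(diffs), pairs


effect, pairs = matched_effect(FMUS)
print("matched pairs:", pairs)
print("matched estimate:", round(effect, 3))
print("naive estimate:", round(naive_difference(FMUS), 3))
```

With these invented numbers the naive comparison is larger than the matched one, because the remote, poorly performing FMU "E" resembles no certified unit and is never used as a comparison. A real evaluation would need many more FMUs, covariates chosen from the theory-of-change, and checks of match quality.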

A mixed bag of methodological approaches
Because certification and other conservation interventions are typically not allocated at random, quasi-experimental approaches are often needed to ensure external validity (i.e., how the outcomes from the intervention could be generalized to fit other contexts; Chen et al. 2011a). Therefore, to gauge the intervention's impact, a counterfactual (i.e., the condition of those affected if the intervention had not occurred) needs to be constructed or selected (Greenstone and Gayer 2007; Ferraro 2009; Jagger et al. 2010).
In summary, evaluating the impacts of forest certification requires a range of quantitative methods to reveal connections along the causal change model. These quantitative methods need to be complemented with qualitative approaches to detect the indirect effects of the intervention (i.e., a mixed-methods approach; Garbarino and Holland 2009; White 2009; Rugh et al. 2010; Bamberger 2012; Stern et al. 2012; Ton 2012).
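One common way to combine a before-after comparison with a comparison group is a difference-in-differences calculation. The technique is not named in the text and is shown here only as an illustrative sketch: the change observed in uncertified FMUs stands in for the counterfactual change that certified FMUs would have experienced without certification. The outcome values and the function name `diff_in_diff` are hypothetical.

```python
def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


def diff_in_diff(treated_before, treated_after, control_before, control_after):
    """Change in the certified group minus change in the uncertified group.

    Assumes both groups would have followed parallel trends without certification.
    """
    change_treated = mean(treated_after) - mean(treated_before)
    change_control = mean(control_after) - mean(control_before)
    return change_treated - change_control


# Invented outcome scores (e.g. an index of forest condition) for two FMUs per group.
certified_before, certified_after = [10.0, 12.0], [16.0, 18.0]
uncertified_before, uncertified_after = [9.0, 11.0], [12.0, 14.0]

estimate = diff_in_diff(
    certified_before, certified_after, uncertified_before, uncertified_after
)
print("difference-in-differences estimate:", estimate)
```

A simple before-after comparison of the certified group alone would attribute the whole change to certification. Subtracting the change in the comparison group removes the part shared by both groups, although the estimate is only as credible as the parallel-trends assumption and the quality of the baseline data.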

Previous assessments of forest certification impacts
Forest certification has received considerable attention from a variety of groups concerned about the fates of the world's remaining forests. With few exceptions, previous attempts to assess the impacts of certification have focused on examining secondary information and stakeholder perceptions. Although none of the approaches reviewed below resemble what could be considered a well-designed evaluation, they nevertheless provide useful information and insights.

Interviews about the impacts of forest management certification
An indirect way to assess the impacts of certification is by soliciting impressions from representatives of FMUs, timber industries, local communities, government officials, environmental groups and buyers. Global and regional studies of this sort all reported that the compilers solicited impressions of both supporters and critics of forest certification but nevertheless reported mostly positive impressions of impacts (Frost et al. 2003; Hartsfield and Ostermeier 2003; Humphries and Kainer 2006; Ebeling and Yasué 2009; Sheil et al. 2010; Zagt et al. 2010). These studies all conclude that certification has done more to improve tropical forestry than any other global initiative (e.g., the Tropical Forestry Action Plan, ITTO Objective 2000). They also agree on the need for empirical assessments of certification impacts.

Literature reviews, systematic reviews and methodological constraints
Literature reviews on the impacts of forest certification have mostly focused on particular issues (e.g., biodiversity, wildlife, local communities) and regions. Typically, the reviews assessed one or a few specific aspects of management such as the protection of riparian buffer zones, seed-tree retention and deforestation. Among the diversity of approaches employed to determine changes in forest management due to certification, some studies employed before-and-after certification comparisons, while others compared certified and non-certified FMUs.
To assess the current status of evaluation of certification, Blackman and Rivera (2010) analysed 134 documents on certification of timber, fish, bananas, coffee and general agricultural practices. Of these studies, only 14 employed designs that appropriately considered the confounding effects of selection bias. Of these 14, the sole study on forest management compared a certified and a nearby uncertified community forest operation in Brazil and reported that certification had small positive environmental and socio-economic impacts. Selection bias in this study was at least partially avoided because the communities had similar land tenure arrangements and both ran their own operations for the primary purpose of timber production (Barbosa de Lima et al. 2008). In a recent update of their 2010 study, Blackman and Rivera (2011) ranked 46 peer-reviewed studies on the impacts of certification on the basis of the robustness of their research designs. Although none of these studies was on timber certification, the results are nevertheless of interest. In 7 of the 11 best-designed studies, certification was not associated with increased environmental, social or economic benefits for the producers. The exceptions were for bananas, for which productivity increases led to on-farm investments; for coffee, for which the social and economic benefits of certification were due to particular conditions and did not translate into gains in education or health; and for tourism, for which certified operations received a price premium.
Unfortunately, due to insufficient data on the individual practices that led to certification, formal meta-analyses were not feasible. It must be noted that although meta-analyses can be useful, they do not replace well-designed evaluations that can result in causal inferences about the contributions of the intervention to the maintenance of forest values and identify unintended and indirect impacts.
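The two comparison designs mentioned in this review, before-and-after comparisons and certified versus non-certified comparisons, can be combined into a difference-in-differences estimate. The sketch below is illustrative only: it is not a method attributed to any of the studies cited, the outcome values are invented, and it assumes that certified and comparison FMUs would have followed parallel trends in the absence of certification, an assumption that selection bias can easily violate.

```python
from statistics import mean


def difference_in_differences(cert_before, cert_after, comp_before, comp_after):
    """Change in certified FMUs minus change in comparison FMUs."""
    change_certified = mean(cert_after) - mean(cert_before)
    change_comparison = mean(comp_after) - mean(comp_before)
    return change_certified - change_comparison


# Hypothetical outcome: annual deforestation (% of FMU area). All values invented.
certified_before = [1.2, 1.0, 1.4]
certified_after = [0.8, 0.7, 0.9]
comparison_before = [1.3, 1.1, 1.2]
comparison_after = [1.2, 1.0, 1.1]

did_estimate = difference_in_differences(
    certified_before, certified_after, comparison_before, comparison_after
)
print(f"Difference-in-differences estimate: {did_estimate:.2f} percentage points")
```

A negative estimate here would suggest that deforestation fell more in certified FMUs than in comparison FMUs over the same period; whether that difference can be attributed to certification still depends on how the comparison FMUs were chosen.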
A more nuanced analytical approach to evaluating the impacts of certification that was recently tested is based on studies on compliance with best-management practices for forestry (BMPs; Newsom et al. 2012). The authors reviewed what they judged to be properly designed and well-executed studies on the impacts of set-asides, including riparian buffer zones, on species diversity, population viability and the quality of water, air and soil. The authors reasoned that in the absence of properly designed evaluations of certification, adherence to BMPs that are also required for certification could provide insights into certification impacts. They contended that disaggregation of the activities required to achieve a desired outcome can aid in the assessment of certification impacts. For example, the observation that establishment of riparian buffer zones benefits sensitive taxa will inform hypotheses that address the underlying causal mechanisms. Disaggregation, also recommended by researchers for other evaluation challenges (Bamberger et al. 2009; Rogers 2009; Jagger et al. 2010), provides a basis for the evaluation of the empirical impacts of certification (Crosse et al. 2012). Unfortunately, disaggregation can also lead to disregard of the integrated and interacting range of impacts of the certification intervention per se. As such, this approach will miss the indirect effects and unintended impacts of certification because it is not based on a detailed theory-of-change.
Recently, the Steering Committee Report of the State-of-Knowledge Assessment of Standards and Certification thoroughly assessed the impacts of standards and certification for achievement of sustainability goals for agriculture, fisheries, forestry and aquaculture (SCR 2012). This effort was based on literature reviews, meetings with business leaders, interviews with key stakeholders and analyses of case studies. Through this lengthy and reflective process, the authors analyzed the contexts, actors, impacts, pathways and trends that will likely affect standards and certification systems in coming years. The resulting document is too rich to summarize in detail here, but it provides evidence of improvements in environmental, social and economic practices associated with, but not clearly attributable to, certification. In addition, the assessment reveals some unintended negative effects on a case-by-case basis. At larger scales in both space and time, the assessment concludes that it is hard to infer impacts of certification. Furthermore, the variety of methods and methodological limitations preclude identification of the causal effects of the certification intervention. The report goes so far as to suggest that the indirect impacts of certification might be of more consequence than direct ones. Its main conclusion is that the incentives available from certification will not be sufficient for FMUs that employ far-from-acceptable management practices. The report also argues for the need to develop a robust evaluation of empirical impacts (i.e., from management practice to outcome) to properly assess the suitability of this private, voluntary, market-oriented policy to promote sustainability.

Review of corrective action requests (CARs)
As mentioned in the previous sections, evaluations of the impacts of certification need to consider differences among FMUs in biophysical, socio-economic and policy characteristics that affect how they are, and should be, managed. For example, forests vary in stocking of commercial species, terrain, accessibility, seasonality and underlying natural dynamics. Social aspects vary with the characteristics of communities living within the FMU's area of influence and their relationships with it (e.g., employment possibilities and freedom of access), including the understanding of community residents about the FMU and its operations. Governance aspects that influence forest management include tenure types and rules governing resource access, policies and regulations that define allowable harvests, required management procedures and fees (e.g., taxes and royalties), participation of local stakeholders and the extent of regulation enforcement (Coleman and Steed 2009; Burgess et al. 2012). Finally, characteristics of the economic sphere of influence on certification include FMU type (e.g., public, communal or private), firm size, markets accessed (e.g., international vs. local), type of harvesting arrangement (e.g., subcontracted or not), technology employed and features of the commercialization process (i.e., market chains and the extent of vertical integration; Amacher et al. 2009; Assunçao et al. 2012). An effective evaluation strategy would have to consider this complexity and address associated methodological challenges. The proposed approach presented below is grounded in the intellectual foundations of evaluators experienced with complex interventions (e.g., P. Rogers, M. Bamberger, H. White, and M.Q. Patton); institutions that support debates on these issues (e.g., Independent Evaluation Group (IEG), Department for International Development (DFID), International Initiative for Impact Evaluation (3ie), Campbell Collaboration; Hivos E-Dialogues, betterevaluation.org); and several researchers (e.g., Jagger et al. 2010; Guijt et al. 2011; Wigboldus and Brouwers 2011; also see references in Section 2). It also sits firmly on the foundation for adaptive management of natural resources formulated by C. S. Holling and his colleagues (Holling 1978), which is fundamental for the sustainable use of natural resources through continued learning and experimentation.

A proposed roadmap
Proposed key activities towards a well-informed evaluation of the empirical impacts of forest certification are outlined here. The first four activities are discussed in more detail below based on the review sections above. Ideas relevant to deliberative processes used to design the evaluation are also assembled. Activities 5 and beyond correspond to implementation of impact and process evaluations and then progress on to the activities that commence once the evaluation is concluded. Given current knowledge about the impacts of certification, it is not possible to discuss these activities in detail. Instead, in Section 4 key information is provided for the design phase that precedes the field-based evaluations. The steps are as follows:
1. Clarify the values that underpin the evaluation: what are the desired and undesired processes, impacts and distributions of costs and benefits for different types of stakeholders?
2. Define the scope or boundaries of the evaluation using a systems approach (Fujita 2010).
3. Design parallel processes of:
a. theory-based impact evaluation of the intervention (henceforth 'impact evaluation').
b. evaluation of the implementation process of the intervention (henceforth 'process evaluation').
4. Identify initial questions to be addressed by the evaluation, continue to refine them and add new ones.
5. Implement both impact and process evaluations: measure impacts and test hypotheses (Platt 1964).
6. Elucidate whether the intervention caused the observed impacts (attribution and contribution analyses).
7. Assess threats to the validity of the evaluation.
8. Synthesize evidence for the impacts.
9. Support the use of the new knowledge gained through the evaluation.
Findings of the impact evaluation need to be built through cross-disciplinary integration of evidence corresponding to the different domains that underlie responsible forest management. As with the results of the process evaluation (Steps 5-7), the findings will represent diverse forms of evidence that need to be integrated to produce an evaluation judgment that participants in the evaluation process will discuss further. Through the iterative processes of deliberation and discussion, synthesis documents will report findings to a wide range of audiences and communicate with those within and outside the certification intervention in a transparent way (Step 8). In the spirit of enhanced capacity for change, activities that support the use of the knowledge generated to influence forest management and certification should be set into motion early in the evaluation process (Step 9). These actions might require new institutional arrangements and policies (e.g., partnership agreements, legal frameworks for monitoring and verification, refined and adapted interventions). The cycle of adaptive management will make one full turn when new experiments on resource management, including tests of novel rules and associated implementation models, are initiated (Figure 2).

Clarify the values underpinning the evaluation of certification
Evaluation involves assessment of the impacts of an intervention based on the values and aspirations of a group of interested parties. Consequently, intended users of the results of an evaluation need to be included in the evaluation's design and then consulted regularly during its implementation. Their involvement will also provide insights into the kinds of information needed to satisfy their needs; it might also help identify how to obtain that information. Overall, this process will generally render the evaluation more engaging, dynamic and transparent. For the evaluation of certification, at least the target audiences listed below should be included:
- Donors who want to know if their investments have served the intended purpose and if the theory behind certification is robust (or was more robust in the past). Donors will likely use knowledge gained as a guide to redirect their investments.
- Government agents who want to gauge how certification works in relation to current and future policies related to forest management (e.g., redundancy and complementarity with existing legal frameworks with similar goals such as national regulations and international efforts at legality assurance).
- Certifying bodies that want to learn how their work contributes to forest management sustainability and to become aware of deficiencies in their work.
- NGOs that support certification but recognise the need for objective and independent measures of its contributions to achievement of their own goals.
- FMUs whose managers are interested in learning how their operations and decisions related to certification fit into the larger picture of forest management sustainability.
- Other stakeholders, especially people who live near FMUs or whose welfare is otherwise affected by forestry operations, and who should benefit from the improved management of certified FMUs, as well as society at large, including consumers who purchase certified goods.

Define the scope of the evaluation
Although this paper focuses on the evaluation of natural forest management in the tropics, similar work is needed in temperate forests and for planted forests. Some of the impact questions pertain mostly to the FMU level, whereas others need to be addressed at larger spatial scales through remote-sensing and other techniques. To define the scope or set the intellectual and geographical boundaries of the evaluation, deliberative processes involving key stakeholders should focus the evaluation efforts on the most important questions.

Impact and process evaluations
Detailed understanding of two different processes is needed to enable a proper evaluation of certification: evaluation of the theory behind the intervention through empirical examination of outcomes (theory-based impact evaluation) and process evaluation (Figure 2; Box 2). Theory-based impact evaluation explores whether impacts are achieved due to the intervention (i.e., if the required practices adopted by managers and audited by certifying bodies actually maintain forest values; SCR 2012; Stern et al. 2012) and seeks to establish whether failure to achieve goals occurred even when the intervention was properly implemented. In other words, theory-based impact evaluation will reveal whether adequate compliance with certification standards secures achievement of intended outcomes. For this purpose, detailed measures of outcomes are needed (e.g., reduced deforestation or the maintenance of timber yields; see Table 3 for a non-exhaustive list; this process corresponds to the dark grey portion in Box 2). Theory-based impact evaluation would require a theory-of-change for forest certification and identification of its underlying risks and assumptions; this tool would propose impact pathways through which change is achieved on the ground.

Notes on a theory-of-change for FSC certification
Development of a theory-of-change to guide the evaluation of certification requires synthesis of existing knowledge and a priori examination of the evaluation questions to be addressed. A theory-of-change is also the product of critical thinking about how and why changes occur, while clarifying expected outcomes and pathways through which change happens (Retolaza 2011; Guijt and Retolaza 2012; Stein and Valters 2012; SCR 2012). A key step towards building a theory-of-change is to assure that the process is participatory (Rogers 2012; Vogel 2012). An iterative approach is also needed so as to make as explicit as possible the intended and realized impacts of the certification intervention (James 2011).

Box 2. FSC certification and evaluation approaches
The logic behind an FMU's decision to manage rather than simply exploit a forest for timber is the first link of the model of change that can lead to FSC certification. Even if an FMU prepares a management plan (MP), it might decide to disregard its plan and carry out logging operations in the business-as-usual (BAU) manner. Alternatively, if the MP is of good quality and the FMU follows it, many of the goals of certification might be reached without a certificate being granted or even sought (dashed black line). Another possibility is that the FMU contracts a certifying body (CB) to make a pre-assessment visit but then realizes the costs of required changes would exceed the expected benefits from certification and abandons further efforts to become certified (dotted line). If the FMU does pursue certification, the next step is to implement on-the-ground actions to comply with certification standards. To become and remain certified, these improved practices are verified through the auditing process by a CB on an annual basis (process evaluation). After being certified, the FMU might allow its certificate to lapse or have it rejected due to failure to address CARs. FMUs that remain certified must still demonstrate their management practices lead to desired outcomes (theory-based impact evaluation). The FMU can then revisit the choice of forest management regime based on a suite of factors and decide to apply for certification, return to BAU management practices or maintain improved management practices.
There is compliance with legal frameworks and procedures to avoid corruption.

FMUs
Workers' housing meets minimum national standards.
Required taxes, fees and royalties are paid in a timely manner.
There is fair benefit-sharing with local communities.

Outputs
Hydrological functions and services (e.g., flow regimes and water quality) are maintained.
Worker welfare is enhanced.
FMU is financially viable to its owner (e.g., government).
Legal/institutional frameworks are effective, efficient and equitable.
Biodiversity is maintained at the genetic, population and landscape levels.
Livelihoods of surrounding communities are enhanced.
Operations in the FMUs guarantee the compatibility of forest resource management with ecosystem service provision.
There is upward accountability a .
Productive capacity is not impaired and future harvest volumes are secured.
The values of timber stands are maintained.
There is downward accountability b .
FMU is financially viable to its owner.
Ecological processes are not threatened by forest management.
HCVFs, buffer zones and riparian set-asides provide biodiversity and other benefits.
a To reassure product consumers, donors, taxpayers, decision-makers and society at large that resources are being wisely used and/or invested (Rogers 2012).
b To inform knowledge users (intended beneficiaries and communities) about whether or not and in what ways a program benefits the community (Rogers 2012).
Note: Shaded boxes of the same color represent sets of outputs and outcomes that can inform on more than one dimension of certification (e.g., gateway outcomes; A. Rowe pers. comm.). RIL = reduced-impact logging. HCVF = high conservation value forest.
term and medium-term effects of an intervention's outputs) towards the goal of responsible forest management (all definitions according to OECD 2002 and Jagger et al. 2010).
A well-crafted theory-of-change may also help clarify which aspects of the intervention are needed and which behavioral and contextual variables need to be addressed to explain variation in the intervention impacts (i.e., impact pathways). In the context of each study site, the theory-of-change helps to reveal the particular roles and interactions among contextual factors (e.g., time of first implementation of certification; existing legal frameworks; capacity and technical knowledge of the staff; timber market characteristics) and the associated actions related to the certification intervention (e.g., involvement of local stakeholders; improvements in timber harvesting practices). The resulting models of change, tailored to particular regions and locales, should reflect specific issues that need to be tracked to assess the impacts of certification. To date, none of the existing natural product certification systems has an explicit and available theory-of-change that describes the change process, but Fairtrade recently started working on one (Nelson and Martin 2011).
An example related to the implementation of measures within FSC's Criterion 4.1 ('Forest management should meet or exceed all applicable laws and/or regulations covering health and safety of employees and their families'; FSC 2012) is used here to guide the elaboration of a theory-of-change (Figure 3). In this example, there are various sources of motivation for FMUs to reduce work-related injuries, including: compliance with national regulations (purple color; Figure 3); pressure from worker unions, trade associations or stockholders; public awareness; concerns about corporate reputation; genuine concern about worker welfare; and the need to comply with FSC criteria. A proper impact evaluation design will help pose research hypotheses to tease out the effect of these and other factors.

INPUTS
Note: Subsequent changes in impacts will depend on how risks and assumptions are considered during implementation of the certification intervention.

FMU = forest management unit; HCVF = high conservation value forest.
In parallel with formulation of a theory-of-change for certification, a participatory analysis of risks and assumptions is needed (i.e., necessary conditions for change; Wigboldus and Brouwers 2011; Stein and Valters 2012). This analysis can reveal factors that might hinder the transition from Outputs to desired Outcomes. Also, assumptions about how the biophysical, social, economic and policy change processes occur, as well as specification of the impact pathways, all need to be explicit and addressed throughout the evaluation. Some apparent risks and assumptions for FSC certification are presented in Table 4.

Process evaluation
Process evaluations (i.e., implementation evaluations; Weiss 1997; Rossi et al. 2004; SCR 2012) seek to determine whether all steps required for certification were implemented as specified by the intervention. They also seek to ascertain if the intervention was implemented differently in different places (Rogers 2012). Steps in the certification process include preparation of a management plan as specified by the FSC certification standards; implementation of practices in accordance with the management plan; and, most critically, verification of compliance with the standards through an audit (this process corresponds to the light grey portion in Box 2). For a proper process evaluation, it is thus critical to know a great deal about the company, the auditors, auditing processes and certifying bodies.
Certification audits include desk and field assessments, as well as consultations with stakeholders (e.g., public consultations). Evaluators of forest certification will need to understand how certification guidelines are interpreted and implemented in the field by the auditors and how auditor performance is verified by the certifying bodies. They will also need to be familiar with how FSC conducts its annual audits of certification bodies, a process that includes random checks of documentation as well as field verifications.
Given the critical importance of the judgments of auditors to the success of certification, more needs to be known about who they are, the conditions and constraints under which they work, and how they make decisions (Cashore and Auld 2012).
Accreditation Services International (ASI, established in 2006; www.accreditation-services.com), which provides third-party accreditation of certification bodies, represents an additional layer of verification of compliance with FSC guidelines. Before 2006, the FSC accredited its certifying bodies directly. Currently, ASI determines the competence of certifying bodies through the review of secondary information, periodic visits to the offices of the certifying body and witness assessments (observations of auditors by ASI personnel). Accreditation of a certifying body involves an initial assessment, annual surveillance to verify implementation of recommendations, and visits to certificate holders by ASI representatives to verify compliance (i.e., an audit of the audits; ASI 2010, 2011).

Evaluation questions
Generally, evaluations address high-level impact questions (e.g., did certification deliver its intended impacts?), as well as questions more centered on the parties affected by the intervention (e.g., for whom and under what conditions did certification work?). They also consider both positive and negative outcomes. More scaled-down questions address types of impacts and their distributions. For example, it is important to know how the impacts of certification have changed over time. Also needed is information about the influence of other factors on certification impacts (e.g., how did certification interact with other initiatives to achieve or fail to achieve desired outcomes?). More fundamentally, it needs to be clear how, why and where the intervention worked, whether there was variation in implementation of the certification process, and to what extent any differences in impacts can be explained by this variation (adapted from Rogers 2012; Ton et al. 2012).
Further grounded questions that might be linked to particular places of interest are how certification schemes and certifying bodies compete among themselves and how such competition affects the performance of FSC certification (e.g., to avoid having to make required changes, some FMUs might prefer to switch among CBs within FSC or switch from FSC to an alternative certification system). The specific questions to be asked through the empirical evaluations should emerge from consultations with a range of relevant stakeholders from each site, and might represent a refined list of those mentioned above.
This section describes information needed to inform discussions and negotiations related to the design of the evaluation of FSC certification impacts. Specific analyses of certain themes may shed light on critical components to inform that design; detailed information on these issues is presented. Results of these analyses are meant to provide a foundation for carrying out the evaluation.

Knowledge about the forest sector
An evaluation of the impacts of forest certification should take into account that, even in the complete absence of this intervention, FMUs vary in the quality of their management. For example, some may employ practices that closely match certification requirements (e.g., employ reduced-impact logging techniques and collaborate with neighboring communities). As a further complication, these practices vary over time in response to a wide variety of factors of which certification is only one (e.g., extent of enforcement of forest regulations, timber market dynamics). It should also be noted that the process of certification involves many steps over several years (e.g., Ruslandi et al. in press) along with many possible shortcuts, particularly if the FMU secures support from NGOs. It is also important to note that both getting and remaining certified require substantial investments of time, money and human resources. Currently certified FMUs vary broadly in when they were first certified as well as in the numbers and types of adjustments in management practices required to obtain or retain their certificates. For example, a study of five certified forest concessions in Kalimantan, Indonesia, revealed that it took from 3 to 10 years to achieve certification (Ruslandi et al. in press).
Variation in management practices and certification status can be portrayed as a continuum, from those FMUs that employ sub-standard management practices (sensu lato) and have no interest in certification to those already certified through several rounds of audits (Figure 4). Some temporal dynamics in management practices and certification status (e.g., certification lost, certification pending) are also possible and are thus represented along this continuum (based on previous research of forest sector evolution and legal regimes in the tropics; Ruiz-Pérez et al. 2005; Salazar and Gretzinger 2005; Pereira et al. 2010). Of particular relevance are changes in macroeconomic conditions, currency exchange rates and land tenure arrangements (Cattaneo 2001; Wunder 2005; Arcand et al. 2008; Sunderlin et al. 2008; Banerjee and Alavalapati 2009). A typology that captures where FMUs are on this continuum is needed to inform the design of empirical evaluation of impacts. Such a typology will also provide information needed to assist in the selection of FMUs as counterfactuals and will generate insights about factors that might directly or indirectly influence the impacts of certification. To this end, up-to-date information about the pool of existing FMUs and the contexts in which they operate needs to be compiled. The variables used to generate the typology are based on previous efforts at characterizing forest sectors (Table 5).
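One way a typology could assist in selecting counterfactual FMUs is by pairing each certified FMU with the most similar non-certified FMU on a set of contextual variables. The sketch below uses simple nearest-neighbour matching on standardized covariates; the technique, the FMU names, the two covariates (`area_ha`, `road_km`) and all values are assumptions chosen for illustration and are not drawn from Table 5 or from any study cited here.

```python
from math import sqrt
from statistics import mean, pstdev


def standardize(units, covariates):
    """Return {name: [z-scores]} so covariates on different scales are comparable."""
    scaled = {u["name"]: [] for u in units}
    for key in covariates:
        values = [u[key] for u in units]
        centre = mean(values)
        spread = pstdev(values) or 1.0  # guard against a constant covariate
        for u in units:
            scaled[u["name"]].append((u[key] - centre) / spread)
    return scaled


def euclidean(a, b):
    """Straight-line distance between two covariate vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def match_nearest(certified, candidates, covariates):
    """Pair each certified FMU with its closest non-certified FMU (with replacement)."""
    scaled = standardize(certified + candidates, covariates)
    return {
        c["name"]: min(
            candidates,
            key=lambda u: euclidean(scaled[c["name"]], scaled[u["name"]]),
        )["name"]
        for c in certified
    }


# Hypothetical FMUs; names must be unique.
certified_fmus = [
    {"name": "A", "area_ha": 50_000, "road_km": 10},
    {"name": "B", "area_ha": 120_000, "road_km": 40},
]
non_certified_fmus = [
    {"name": "X", "area_ha": 55_000, "road_km": 12},
    {"name": "Y", "area_ha": 110_000, "road_km": 35},
    {"name": "Z", "area_ha": 20_000, "road_km": 80},
]

matches = match_nearest(certified_fmus, non_certified_fmus, ["area_ha", "road_km"])
print(matches)
```

Matching on observed variables cannot account for unobserved differences such as managers' attitudes towards certification, which is one reason the qualitative knowledge described in this section remains necessary.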

Knowledge about the temporal dynamics of certification
The position of an FMU along the certification continuum (Figure 4) can change over time in response to political economy factors related to the timber and other sectors, investments, opportunities, market realities and other drivers. In particular, contextual factors that operate at local, national and international levels can influence an FMU's decisions to opt for certification and, once certified, to remain so. Market dynamics (e.g., consumer preferences and acquisition power) change and, in turn, influence suppliers' decisions vis à vis certification. Shifting legal frameworks and their enforcement, technical capacities, technological innovations, global/regional/national economic conditions and cost-benefit ratios are among the factors that can affect FMU decisions about remaining certified or not (Nebel et al. 2005; Kollert and Lagan 2007; Crow and Danks 2010; Chen et al. 2011b). Substantial understanding of these dynamics will help reveal the true impacts of forest management certification. Proposed categories are Never Certified, Considering Certification, En route to Certification, Certified, Certification Failed and Certification Lost (Figure 5). By Never Certified we refer to FMUs with no apparent interest in certification and for which certification, under current contextual conditions, might represent too much of an investment risk or a waste of resources.
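To track such temporal dynamics, the six proposed categories can be recorded for each FMU at successive observation dates and the observed changes tallied. The following is a minimal sketch under stated assumptions: the category names follow the text, but the FMU identifiers and status histories are invented, and no claim is made about which transitions are possible in practice.

```python
from collections import Counter
from enum import Enum


class Status(Enum):
    NEVER_CERTIFIED = "Never Certified"
    CONSIDERING = "Considering Certification"
    EN_ROUTE = "En route to Certification"
    CERTIFIED = "Certified"
    FAILED = "Certification Failed"
    LOST = "Certification Lost"


def count_transitions(histories):
    """Count observed status changes between consecutive observations of each FMU."""
    counts = Counter()
    for sequence in histories.values():
        for previous, current in zip(sequence, sequence[1:]):
            if previous != current:
                counts[(previous, current)] += 1
    return counts


# Hypothetical status histories, one entry per observation date.
histories = {
    "FMU-1": [Status.CONSIDERING, Status.EN_ROUTE, Status.CERTIFIED, Status.CERTIFIED],
    "FMU-2": [Status.NEVER_CERTIFIED, Status.NEVER_CERTIFIED],
    "FMU-3": [Status.EN_ROUTE, Status.CERTIFIED, Status.LOST],
}

transitions = count_transitions(histories)
for (previous, current), n in sorted(
    transitions.items(), key=lambda item: (item[0][0].value, item[0][1].value)
):
    print(f"{previous.value} -> {current.value}: {n}")
```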
Understanding the contextual factors that influence forest management decisions in general and certification in particular should help explain [...] domain (i.e., internal such as costs, expertise and knowledge vs. external such as consumer confusion and supply-chain structure); and their dynamics should also reveal much about FMU transitions along the certification continuum in Figure 4.
Note: The certification continuum superimposed on an axis of responsible forest management reflects different stages related to certification (ovals). An FMU that loses its certificate might end up at different places along the axis depending on whether it continues to employ the practices required for certification. It is worth noting the possibility that FMUs not interested in certification employ responsible management practices that lead to the maintenance of the biophysical, social, economic and political values of forests (Box 2). Also, FMUs working towards certification employ a range of different quality practices (ghosted ovals).
In-depth understanding of FMU decision-making processes about certification will reveal the relative roles of factors such as legal contexts, technology access and information availability. Diverse factors influence an FMU's decision to seek and retain certification; they differ in their effects among countries, regions and types of FMUs, and vary over time. FMUs make decisions about certification based on the economic and other sorts of costs and benefits (Richards 2004). Key inputs for this assessment are the results of the typology and data on the dynamics of the forest sector (discussed in the previous sections), which will provide opportunities for comparing information obtained with contextual factors. A recent report that proposes a typology of FMU attitudes towards certification (SCR 2012) offers valuable reflections on self-selection. These attitudes, in combination with other factors (e.g., estimated cost-benefit ratios), influence FMU decisions regarding certification. This framework is used to outline current knowledge about self-selection into certification (Table 6).
Several studies have attempted to assess the direct and indirect costs of certification as a potential barrier to participation. Some direct costs vary with baseline FMU management practices. For example, FMUs that employ good management practices will need to invest relatively little to satisfy certification standards. But even in relatively well-managed FMUs there are often deficits in the knowledge, skills and practices needed to reach certification standards. Apart from improvements in management practices and record-keeping, other costs include equipment needs, staff training and additional salaries, audit and membership fees, monitoring and record-keeping, and consultation processes, some of which vary with FMU size (Cashore et al. 2006; Gale 2006; Roberts 2012).
The financial costs of certification vary substantially. In a study of five FSC-certified concessions in Indonesia, for example, Ruslandi et al. (in press) estimated costs of certification of USD 300 000–700 000 per concession with averages of USD 4.76/ha and USD 0.1/m³, half of which was covered by outside agencies. In general, these costs are lowest for FMUs that already employ sound management practices, which is why they may be among the first to seek certification (Blackman and Rivera 2011; Blackman and Guerrero 2012). Likewise, due to economies of scale, the ability to cover direct and indirect costs (and thus effectively join certification schemes) increases with the size of the FMU (Nussbaum et al. 2001; Rivera 2002; Roberts 2012). In recognition of this bias against small FMUs, many of which are community run, the FSC started the Small- and Low-Intensity Managed Forest (SLIMF) program in which FMUs pay reduced rates for initial and follow-up audits (e.g., Brazilian FSC SLIMF standard, FSC 2013).
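The cost figures above can be related to one another with simple arithmetic. The sketch below is a rough illustration only: it assumes that the reported external contribution of one half applies to the total cost of each concession, and it applies the average per-hectare cost to individual concessions, which the source does not state. The function names are hypothetical.

```python
# Figures reported by Ruslandi et al. (in press), as cited in the text.
AVERAGE_COST_PER_HA_USD = 4.76
EXTERNAL_SHARE = 0.5  # assumption: half of each concession's total cost is covered externally


def cost_borne_by_fmu(total_cost_usd, external_share=EXTERNAL_SHARE):
    """Portion of the certification cost left to the FMU after outside support."""
    return total_cost_usd * (1 - external_share)


def implied_area_ha(total_cost_usd, cost_per_ha=AVERAGE_COST_PER_HA_USD):
    """Concession area implied if the average per-hectare cost applied exactly."""
    return total_cost_usd / cost_per_ha


for total in (300_000, 700_000):
    print(
        f"Total USD {total:,}: FMU share USD {cost_borne_by_fmu(total):,.0f}, "
        f"implied area about {implied_area_ha(total):,.0f} ha"
    )
```

Because the per-hectare figure is an average across five concessions, the implied areas indicate order of magnitude only.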

Strategic risk management
-Use certification to buffer market risks.
-Reduced operating costs due to improved forest management operations and better trained personnel.
-Enhanced learning and transparency.

Knowledge about the political economy of the forestry sector
A final issue related to the evaluation of certification pertains to the contexts in which FMUs are embedded. Certification is applied in a diversity of places that vary in characteristics that can affect the outcomes of the intervention and how they change over time. Examples include governance issues such as changes in legal frameworks and their implementation, as well as changes in the policy environment (e.g., timing of elections, changes of government); social issues related to changes in capabilities and extents of participation; economic factors that vary with market structures and preferences, supply chain structure and dynamics, technology and availability of capital; and biophysical aspects of forest management that range in resource availability as affected by harvesting history, intensity, disease, fire and climate change.
Conversely, it is also important to consider the direct and indirect impacts of the certification intervention in the context in which it is applied. For example, certification principles inspired national forest policy-making in Bolivia (Nittler and Nash 1999), while forest certification contributed to enhanced forest law compliance in Cameroon (Cerutti et al. 2011). In Russia, certification helped to build adaptive capacity by increasing knowledge of local stakeholders through more participative forest management processes (Keskitalo et al. 2009). Thus, a solid grasp of how policies and programmes evolve and shape decisions regarding forest management is needed; this can help to adequately contextualize the outcomes of forest management certification.
Proper understanding of the political economies in which forest management occurs may also aid the design of robust evaluations of certification.
Knowledge about land use change
Blackman (2012) and Gaveau et al. (2012) point out that remote-sensing techniques can help measure how certification reduces deforestation at a diversity of scales. For example, certification might affect the probability of deforestation at the FMU level, at the level of a company that manages several FMUs or at the jurisdictional level where it is located; reducing deforestation is one of the objectives of certification (Viana et al. 1996). Availability of cloud-free images remains a challenge in some parts of the tropics, but the approach is well established and at least regional expertise is generally available. The impacts of construction of logging roads, opening of skid trails and even felling can also be monitored remotely. However, the required techniques (e.g., ClasLite: www.claslite.stanford.edu) are sophisticated and will not be considered further here.
Remote sensing can reveal changes in forest cover in FMUs along the certification continuum, but a theory-of-change is needed to elucidate causes for observed changes. For example, lack of governmental support for enforcement of FMU rights or high costs may constrain FMUs from stopping deforestation (Coleman and Steed 2009; McElwee 2010; Amacher et al. 2012). Conversely, a remote-sensing study might miss a more insidious cause of forest loss: when boundaries of FMUs are redrawn to allow portions to be cleared legally. Such an impact will be revealed only if both the original and final FMU boundaries are known. Remote sensing will also not reveal the causes of deforestation within FMUs that retained their certificates. Finally, remote-sensing studies of the effects of certification on deforestation still need appropriate counterfactuals; these would determine whether or not deforestation would have happened in the absence of certification. As such, to infer the causal effects of certification on observed land use changes, remote-sensing evidence needs to be interpreted with reference to the full set of contextual and site-specific characteristics used in the forest typology and the construction of a theory-of-change.
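The role of a counterfactual in such a comparison can be sketched with a simple difference-in-differences calculation. This is only one common way of comparing treated and comparison groups before and after an intervention, not a method prescribed by this paper; the forest-loss rates, the variable names and the function `did_estimate` are hypothetical and serve illustration only.

```python
from statistics import mean

# Hypothetical annual forest-loss rates (% of FMU area per year).
# These values are invented for illustration; they are not study data.
certified = {"before": [0.9, 1.1, 1.0], "after": [0.5, 0.6, 0.4]}
comparison = {"before": [1.0, 1.2, 0.8], "after": [0.9, 1.0, 0.8]}


def did_estimate(treated, control):
    """Return the difference-in-differences of mean outcomes.

    The change observed in the comparison group stands in for what would
    have happened to the certified group without certification.
    """
    treated_change = mean(treated["after"]) - mean(treated["before"])
    control_change = mean(control["after"]) - mean(control["before"])
    return treated_change - control_change


effect = did_estimate(certified, comparison)
print(f"Estimated effect on annual forest loss: {effect:+.2f} percentage points")
```

The estimate is only as credible as the comparison group: if certified and comparison FMUs differ in ways that also drive deforestation, the difference cannot be attributed to certification.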
Previous efforts at assessing certification impacts on deforestation used panel data from the Food and Agriculture Organization of the United Nations (FAO) for 1972-1994 and 2005-2010 for developed and developing countries (Damette and Delacote 2011).

Forest certification is a private, voluntary, market-driven instrument designed to promote responsible forest management. While many certification systems operate around the world, this paper focused on certification of natural forest management in the tropics by the Forest Stewardship Council (FSC). The FSC certified its first tropical forest in 1994 and has certified more than 100 other natural forest management units (FMUs) in the tropics since then. However, the often-claimed environmental and social benefits of certification remain to be empirically evaluated. After reviewing the literature on the impacts of certification, a foundation is laid here to develop an evaluation approach of what many observers claim as the most effective tropical forest conservation intervention ever initiated. The next stage is to critically analyse and synthesize information from studies outlined in the previous section and compile lessons learned and knowledge gathered into a design for an evaluation.
To be credible, salient and effective, an evaluation of the impacts of certification needs to be participatory from the outset. Input from the full gamut of relevant stakeholders, coupled with compilation of the salient biophysical and socioeconomic characteristics of certified and other FMUs, will form the basis for designing various formal methods of evaluation of this complex intervention. Among these activities, emphasis should be given to formulating a series of theories-of-change for the intervention that capture contrasting views of and angles on certification.
The evaluation framework proposed here aims at tracking variation in the quality of implementation of certification (i.e., process evaluation). Likewise, it integrates results of empirical research that tests hypotheses motivated by how specific contextual factors shape certification outcomes. For example, large concessions, private landowners and communities are among the FMUs that have opted for certification. This basic information is organized in a typology of FMUs that will serve to produce a better understanding of dynamics in the certification sector and the associated analysis of the self-selection process of FMUs into and out of certification (or switches among certification schemes/bodies). In-depth, historical political economy appraisals of the forest and timber sectors help to analyse contextual factors and other elements exogenous to the firms that manage FMUs and to the FMUs themselves. The resulting studies are meant to explain behaviors observed in the certification dynamics and self-selection analyses, to formulate hypotheses to guide the evaluation, and thus to justify the evaluation design.
Finally, a diversity of stakeholders (from representatives of local and regional communities and governments, and environmental and social NGOs, to FMUs at all levels of decision-making) should be invited to contribute to, and to benefit from, the knowledge gained from an evaluation of forest management certification. After all, the framework proposed here is only the first step towards the design and future implementation of evaluation research in the context of tropical forest certification on a global basis. Ultimately, it is hoped the research framework proposed will contribute to learning from past mistakes, building on lessons learned and enhancing decision-making towards the maintenance of forest values over the long term and for the benefit of society as a whole.

Center for International Forestry Research (CIFOR)
CIFOR advances human well-being, environmental conservation and equity by conducting research to help shape policies and practices that affect forests in developing countries. CIFOR is a member of the CGIAR Consortium. Our headquarters are in Bogor, Indonesia, with offices in Asia, Africa and South America.

CIFOR Occasional Papers contain research results that are significant to tropical forest issues. This content has been peer reviewed internally and externally.

Figure 3. Example of a specific theory-of-change that accounts for reduction of work-related injuries in a certified FMU
Note: Boxes with dashed outlines represent assumptions behind the transition from one step to the next in the sequence of actions that lead towards the long-term goal of reduced work-related injuries. Dashed boxes could also be channels for certification impacts. For example, the law may nominally require safety plans and safety equipment (purple box and arrow), but it is certification that ultimately encourages the FMU to require/provide incentives for workers to use the equipment.

Figure 4. The certification continuum

This research was carried out by CIFOR as part of the CGIAR Research Program on Forests, Trees and Agroforestry (CRP-FTA). This collaborative program aims to enhance the management and use of forests, agroforestry and tree genetic resources across the landscape from forests to farms. CIFOR leads CRP-FTA in partnership with Bioversity International, CIRAD, the International Center for Tropical Agriculture and the World Agroforestry Centre.

Table 1. National standards for forest management certification in tropical countries

Figure 1. Extent of natural forest (in ha) certified by the Forest Stewardship Council (FSC) as of March 2013
Note: Number of certificates per country noted parenthetically. Source: Forest Stewardship Council.
[...] counterfactuals are rare (Rowe 2012), a comparison group usually needs to be constructed on the basis of variables expected to affect the outcomes of the intervention. A diversity of statistical techniques (e.g., instrumental variables, IV; regression discontinuity designs, RDD; matching) are used to analyse these situations (Rosenbaum and Rubin 1983). More recently, other approaches have been used where self-selection poses methodological problems (e.g., encouragement designs: Bradlow 1998; West et al. 2008; endogenous switching regression: Kleeman and Abdulai 2012).

In the case of certification, many variables that affect the maintenance of forest values are likely to also influence the probability of participation in certification. These variables could include FMU area, market orientation and institutional arrangements such as community based, private or public (for more examples see Table 2). Exact matching entails tradeoffs between the stringency of criteria used to select counterfactuals (e.g., the number of matching variables used) and the number of possible replicates.

Quantitative studies on the impacts of conservation interventions should be complemented by qualitative insights gained through both a variety of techniques and available secondary information to triangulate evidence (i.e., addressing the same research question from different perspectives; Ton 2012). Also, governmental and other data are needed to establish the external validity of the certification intervention model. Among the qualitative approaches to evaluation are exploratory and participatory methods. Exploratory methods (e.g., general elimination methodologies, process tracing and contribution analysis) can help explain what occurred and how, and thereby help infer causality (Bamberger et al. 2009; Collier 2011; White and Phillips 2012). Participatory approaches rely on stakeholder perceptions to examine behavioral change, establish attribution and suggest how interventions can be improved (e.g., most significant change, the success case method, outcome mapping, stakeholder and multicriteria analyses, participatory impact assessment and participatory social mapping; Rogers 2008; Chambers 2009; Rogers 2009; Schreckenberg et al. 2010; White and Phillips 2012). Qualitative approaches should still be based on a comparative framework such as before-and-after comparisons of both treated and control groups or interrupted time series (GAO 2009).
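The tradeoff in exact matching between the number of matching variables and the number of available comparison units can be made concrete with a minimal sketch. The FMU records, the attribute names (`size`, `market`, `tenure`) and the helper `exact_matches` are hypothetical and chosen only to echo the kinds of variables mentioned in the text; a real evaluation would draw them from the FMU typology.

```python
# Hypothetical FMU records; ids and attribute values are illustrative only.
fmus = [
    {"id": "A", "certified": True, "size": "large", "market": "export", "tenure": "private"},
    {"id": "B", "certified": True, "size": "small", "market": "domestic", "tenure": "community"},
    {"id": "C", "certified": False, "size": "large", "market": "export", "tenure": "public"},
    {"id": "D", "certified": False, "size": "large", "market": "export", "tenure": "private"},
    {"id": "E", "certified": False, "size": "small", "market": "export", "tenure": "community"},
    {"id": "F", "certified": False, "size": "small", "market": "domestic", "tenure": "private"},
]


def exact_matches(records, variables):
    """Map each certified FMU id to the ids of uncertified FMUs that
    agree with it on every one of the matching variables."""
    treated = [r for r in records if r["certified"]]
    controls = [r for r in records if not r["certified"]]
    return {
        t["id"]: [c["id"] for c in controls if all(c[v] == t[v] for v in variables)]
        for t in treated
    }


# Adding matching variables makes matches more similar but fewer in number.
for variables in (["size"], ["size", "market"], ["size", "market", "tenure"]):
    print(variables, exact_matches(fmus, variables))
```

With these invented records, matching on all three variables leaves one certified FMU without any comparison unit, which is the loss of replicates that stricter criteria entail.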
[...] Karmann and Smith 2009). Other studies examined the apparent impacts of certification on different aspects of management, including community forestry (Molnar 2004; Corso et al. 2008), community enterprises (Butterfield et al. 2005), biodiversity (Walrecht et al. 2012) and changes in forest cover as a function of the relative importance of the forest sector (e.g., high level of timber production and proportion of GDP; Damette and Delacote 2011).

[...] corrective action requests (CARs) issued by FMU auditors from CBs. The FMUs are typically given deadlines to address each CAR; in the case of major CARs, failure to correct the identified problem within the specified time period results in refusal to grant the certificate, or else its immediate suspension. Reviewing the assignment and satisfaction of CARs provides insights into the nature of problematic management issues and their evolution, as well as indirect evidence of improved practices (Rametsteiner and Simula 2003; Newsom and Hewitt 2005; Newsom et al. 2006; Peña-Claros et al. 2009).

[...] of CARs are based on process and not performance. This means the issued CARs are indirect, but of key importance, to the maintenance of forest values (Nussbaum and Simula 2004; Bartley 2007; Peña-Claros et al. 2009; van Kuijk et al. 2009).

Complementarity of the impact and process evaluations of certification that together facilitate the generation of new knowledge in a participatory manner (i.e., social learning) and frame adaptive management of forests
Note: One result is improvements in forest management through enhanced decision-making processes and arrangements (e.g., boosted accountability, more appropriate policy arrangements and institutions, increased knowledge). Policy experimentation (e.g., adaptation and design of new interventions) for resource management and continued learning represent the foundation for adaptive management (dashed black line).

[Figure labels: BAU; MP; get certified; theory-based impact evaluation; process evaluation; forest values maintained]
[...] (e.g., unpredictable markets) and clarifies [...] technical; Furman 2005; Rugh et al. 2010; White 2010; Gertler et al. 2011; Rogers 2012). It specifies the inputs (i.e., [...]), outputs (i.e., the products, capital goods and services that result from the intervention, as well as any changes resulting from the intervention that are relevant to the achievement of outcomes) and outcomes (i.e., the expected or achieved short- [...]

Table 5. Variables that affect forest management
[Table columns: Type of characteristic; Attribute. First category shown: Exogenous to the firm]
Note: The table illustrates examples of variables likely to influence the short- and long-term outcomes of forest management and thus the impacts of certification. Data collected with this framework will be used to construct typologies of FMUs and thus to inform the dynamics of certification and the process of self-selection into the scheme.

Contextual and other factors that influence FMUs' management decisions
Note: Contextual and other factors (e.g., retention of trained workers; availability of capital) influence the choices FMUs make about certification over time. Arrows represent the transition probabilities (or proportions) of FMUs that remain in each category (curved) and that move into other categories (straight, including dashed lines) during a given period of time.

Possible indirect costs of certification not considered above derive from foregone harvest [...]
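The transition proportions described in the note above can be sketched as a small table of moves between categories. The three category names, the counts and the helper functions below are hypothetical; the actual categories of the certification continuum would come from the FMU typology, and the counts from monitoring FMUs over a defined period.

```python
# Hypothetical categories and hypothetical counts of FMUs observed moving
# between them during one period (rows: origin, columns: destination).
states = ["uncertified", "in_process", "certified"]
moves = {
    "uncertified": {"uncertified": 80, "in_process": 15, "certified": 5},
    "in_process": {"uncertified": 10, "in_process": 50, "certified": 40},
    "certified": {"uncertified": 4, "in_process": 6, "certified": 90},
}


def transition_proportions(counts):
    """Convert counts of moves into row-wise proportions (each row sums to 1)."""
    return {
        origin: {dest: n / sum(row.values()) for dest, n in row.items()}
        for origin, row in counts.items()
    }


def project(distribution, proportions):
    """Apply the transition proportions once to a distribution of FMUs."""
    return {
        dest: sum(distribution[origin] * proportions[origin][dest] for origin in distribution)
        for dest in states
    }


P = transition_proportions(moves)
start = {"uncertified": 200, "in_process": 50, "certified": 50}
print(project(start, P))
```

Diagonal entries of `P` correspond to the curved arrows (FMUs that remain in a category) and off-diagonal entries to the straight arrows (FMUs that move). Comparing such proportions across FMU types or periods is one way to describe the dynamics of self-selection into and out of certification.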

Table 6. Typology of attitudes regarding forest management
Potentially explanatory variables included in the analyses were the volumes and values of harvested timber. Control variables included national-level indicators of institutional quality (i.e., indices of political rights and civil liberties); a model of deforestation as a function of the country's GDP; annual GDP growth rate; population density; and the country's forest cover at the beginning of the study period. Results from this study suggest that countries in which FSC certification was prominent experienced less deforestation (Damette and Delacote 2011), but the study did not address the causal mechanisms that could explain this outcome.