Methods and projects / Working papers

The INE Working Papers collection disseminates original research related to the work of a statistical office, carried out by INE staff, possibly in collaboration with researchers from other institutions.
The papers focus on developing methodological aspects or on analysing and studying the results of official statistical operations from different perspectives.

On Admissibility in Bipartite Incidence Graph Sampling
Pedro García Segador, Li-Chun Zhang
- Doc.
  13/2024
  
  Resumen / Abstract
  In bipartite incidence graph sampling, the target study units may be formed as connected population elements, which are distinct from the sampling units, and there may generally exist more than one way in which a given study unit can be observed via sampling units. This generalizes finite-population element or multistage sampling, where each element can only be sampled directly or via a single primary sampling unit. We study the admissibility of estimators in bipartite incidence graph sampling and identify admissible estimators other than the classic Horvitz-Thompson estimator. Our admissibility results encompass those for finite-population sampling.
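For orientation, the classic Horvitz-Thompson estimator named in the abstract weights each sampled value by the inverse of its inclusion probability. The sketch below covers only ordinary finite-population sampling, not the graph-sampling setting studied in the paper; the function name and the toy numbers are illustrative assumptions.

```python
def horvitz_thompson_total(values, inclusion_probs):
    """Estimate a population total as the sum of y_i / pi_i over the sampled units."""
    if len(values) != len(inclusion_probs):
        raise ValueError("values and inclusion_probs must have the same length")
    total = 0.0
    for y, pi in zip(values, inclusion_probs):
        if not 0.0 < pi <= 1.0:
            raise ValueError("inclusion probabilities must lie in (0, 1]")
        total += y / pi  # each sampled unit stands in for 1/pi population units
    return total


# Toy sample (hypothetical values): three units with unequal inclusion probabilities.
sample_values = [10.0, 4.0, 6.0]
sample_probs = [0.5, 0.25, 0.5]
estimate = horvitz_thompson_total(sample_values, sample_probs)
print(estimate)
```

Units with a small inclusion probability receive a large weight, which is what makes the estimator design-unbiased for the population total.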
  
  Palabras clave / Key words
  graph sampling, admissibility, sufficiency, Rao-Blackwellization.
  
  Documento / Document
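Model for assigning mean income values in the ADRH to protect statistical confidentiality in sparsely populated municipalities (Modelo de asignación de valores de renta media en el ADRH para la protección del secreto estadístico en municipios de poca población)
Álvaro Amasuno Arrebola, Ángela Martín Jiménez, Manuel Refoyo Mayoral, Eloy Vergara Gómez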
- Doc.
  12/2024
  
  Resumen / Abstract
  The Household Income Distribution Atlas (Atlas de Distribución de Renta de los Hogares, ADRH), published annually by INE, provides income information at a very detailed geographical level. To preserve the principle of confidentiality protection established by national and European law, income-value assignment methods have been applied in sparsely populated areas that require protection. This paper presents and evaluates a new assignment method for these areas, implemented from the 2022 edition of the ADRH, published in 2024, which substantially improves on the method used up to the previous edition.
  
  Palabras clave / Key words
  Household Income Distribution Atlas (ADRH), statistical confidentiality, agricultural districts, small municipalities, gross income.
  
  Documento / Document
Value Added and Employment in Spanish Exports: An analysis Based on FIGARO Data
Juan Cervigón, Jorge Novalbos, Maribel Parra, Sixto Muriel
- Doc.
  11/2024
  
  Resumen / Abstract
  Exports play a crucial role in the Spanish economy, contributing significantly to the gross domestic product and employment. This study analyzes the value added and employment generated by Spanish exports during the years 2010 to 2022, using data provided by FIGARO (Full International and Global Accounts for Research in Input-Output Analysis), a database developed by Eurostat that offers detailed information on the input-output relationships between countries and economic sectors. The research focuses on evaluating how Spanish exports contribute to the creation of domestic value added and the generation of employment, both directly and indirectly, through global value chains. It examines the evolution of these indicators during the period from 2010 to 2022, with the final years marked by the COVID-19 pandemic and the subsequent economic recovery. The study examines the distribution of value added and employment generated by Spanish exports across different branches of activity and identifies the main destination markets and their evolution during the analyzed period. The work concludes by highlighting the strategic importance of exports for the Spanish economy, not only as a source of economic growth but also as a driver of quality job creation and the improvement of international competitiveness. It emphasizes the need to develop such statistics that harmonize information from different countries for the development of public policies that promote business internationalization, innovation, and adaptation to new global trends, such as digitalization and the green transition. This analysis provides valuable information for policymakers, entrepreneurs, and academics interested in understanding the dynamics of Spanish exports and their impact on the national economy. 
  The results obtained can serve as a basis for designing strategies that enhance the value added and employment associated with export activities, thus contributing to the sustainable economic development of Spain in an increasingly competitive and changing global context.
  
  Palabras clave / Key words
  Value Added, Employment, Exports, analysis, FIGARO Data
  
  Documento / Document
Measuring the quality of administrative sources: at macro level with novel indicators and micro level with distributions comparison
Alicia Nieto, Sandra Barragán, Alba Rodríguez, Soledad Saldaña, David Salgado
- Doc.
  10/2024
  
  Resumen / Abstract
  In the production of official statistics there are three main data sources: surveys, administrative registers, and (privately held) digital sources. The use of administrative sources has been increasing lately; however, there is a lack of control over the quality of these new sources. Administrative data can be used in different ways, the most challenging being their use as a primary source of data, directly or indirectly, to compute the target aggregates.
  The advantages of administrative data as a primary source are widely known, since they improve several quality dimensions (reduction of response burden, cost savings, increased granularity, etc.), but the disadvantages must be considered carefully. In terms of the representation and measurement lines in the Total Survey Error paradigm and the Two-Phase Life-Cycle model by Zhang, errors related both to units and to variables are present. Coverage errors arise when identifying units in the target population, and validity errors proliferate because of the differences between concepts for statistical and administrative purposes. Therefore, a need to measure the quality of input data emerges as a consequence of the data generation process lying outside the control of NSIs.
  In official statistics several quality and performance indicators are used but the focus is on measuring the quality of the output, and most of them have been designed to be used when using survey data as input. So, there is a need to broaden the list of quality indicators to provide room for those quality measures of multisource statistics and even more in the case of statistics based only on administrative data.
  At Statistics Spain we are carrying out an exercise to measure the quality of the administrative data used in several short-term statistics of different domains/characteristics. In this work we present the proposal to measure the quality of the input with some indicators for the administrative data source. Moreover, we take advantage of the access to both administrative and survey variables for a part of the sample to directly compare the distributions of the target variable under study.
  The ultimate goal is to provide objective measures of the direct use of administrative values without further treatment, to gain some knowledge about the quality of the final estimates in comparison with traditional, fully survey-based results. This analysis may help us decide whether administrative sources need treatment to keep their disadvantages under control and ensure the quality of admin-based final outputs.
  
  Palabras clave / Key words
  quality indicators, administrative data, survey-admin comparison
  
  Documento / Document
Improving quality in seasonal adjustment in Short-Term Statistics using JDemetra+ regressors and TEAM R-package
Félix Aparicio Pérez, José Fernando Arranz Arauzo, C. Gómez, E. Rosa-Perez, C. Sáez Calvo, L. Sanguiao Sande, María Teresa Vázquez Gutiérrez
- Doc.
  09/2024
  
  Resumen / Abstract
  Short-term business statistics (STS) are the earliest statistics released to show emerging trends in the European economy. Monthly and quarterly STS provide data for the main economic sectors: industry, construction, trade and services, excluding financial and public services.
  STS Regulation requires data that are calendar adjusted and calendar and seasonally adjusted, in addition to unadjusted data. Seasonal adjustment (SA) procedures eliminate the estimated seasonal and calendar effects from the original time series and obtain SA-estimates that are likely to reveal what is new in a time series.
  JDemetra+ is the seasonal adjustment software officially used in the European Statistical System. Among other methods, it allows the use of a model-based TRAMO-SEATS approach for performing seasonal adjustment of time series. In this approach, a RegARIMA model is fitted to the series. It also offers the chance to calculate regression variables to model calendar effects, including trading day regressors that take into account the composition of the days of the month.
  The ARIMA models used to adjust STS time series play a very important role in obtaining accurate adjusted data. However, updating the ARIMA models can become burdensome: manual identification of a suitable model can be complex and time-consuming, and automatic procedures can return a model that ignores restrictions the domain expert considers essential.
  Time-Series Exhaustive Automatic Modelling (TEAM) is an R package developed by Statistics Spain and based on the JDemetra+ ecosystem that can help in the process of updating ARIMA models by providing a list of models ordered according to a global score.
  Methodologically, we can set a priori specifications (outliers, calendar regressors, maximum/minimum values of the ARIMA parameters), and the local scores by hierarchical levels can help us guarantee the quality. In this sense, one of the main advantages of TEAM is that the ARIMA models provided can be subject to restrictions specified by the domain expert.
  To carry out calendar adjustment at Statistics Spain we have been using customized working days regressors. However, for time series of specific activities the residual effects of trading days were not being completely removed. Several analyses have been undertaken and improvements have been achieved when using JDemetra+ regressors.
  Owing to the increase in quality, the JDemetra+ regressors and the use of the TEAM package to update the ARIMA models have been included in the seasonal adjustment process for STS at Statistics Spain as of January 2024.
  
  Palabras clave / Key words
  seasonal adjustment, JDemetra+, R, Time-Series Exhaustive Automatic Modelling
  
  Documento / Document
Integration of administrative and survey data in a Short-Term Business Statistics with statistical learning algorithms
Sandra Barragán, David Salgado, Sergio Pardina, Esther Puerto
- Doc.
  08/2024
  
  Resumen / Abstract
  The use of administrative data (and digital data sources) is a must not only for the modernization of the production of official statistics but also for keeping relevance in the new data and AI international ecosystem. These new data sources must be integrated with survey data. However, as it is widely known, this incorporation of new data sources does not come without quality challenges. By and large, the direct substitution, use, or aggregation of administrative data cannot be undertaken since errors both in the representation and measurement lines arise even when formerly they were under control using only survey data.
  Representation errors (especially regarding coverage) arise because of unit misclassification errors and other factors. Validity, measurement, and process errors easily occur because of the administrative (non-statistical) purposes of these data sources. Overall, the fact that the data generation mechanism lies outside the control of the statistical process revives both non-sampling errors (validity error, for example) and inferential challenges (non-ignorability, for instance).
  We present a proposed end-to-end statistical production process integrating administrative data with survey data in a probability sample. Synthetic values produced from a tax source are computed using a statistical learning model so that validity and measurement errors can be a priori identified and kept under control. The statistical learning algorithm learns from past and present survey and administrative data producing high-quality values for non-influential units, which paves the way to reduce response burden. Influential units are still integrated using survey data.
  We share a proof of concept on the monthly Services Sector Activity Indicators using VAT data. We discuss challenges regarding the quality of the sampling design, the statistical model, and the training data.
  
  Palabras clave / Key words
  data integration, administrative data, end-to-end statistical production process
  
  Documento / Document
Time-Series Exhaustive Automatic Modeling: a new methodology for model identification
Félix Aparicio Pérez, José Fernando Arranz Arauzo, Carlos Sáez Calvo, Luis Sanguiao Sande, María Teresa Vázquez Gutiérrez
- Doc.
  07/2024
  
  Resumen / Abstract
  Seasonal adjustment of time series plays a pivotal role in modern official statistics, ensuring accurate and reliable data analysis. However, due to resource constraints and time limitations, the models identified in an automatic way using the current software may not be optimal. This leads to a worse performance of seasonal adjustment, since these models must be maintained for a year.
  We present a new R package, Time-Series Exhaustive Automatic Modeling (TEAM), which aims to automate and enhance the yearly model identification phase. The goal is to provide in an automated way a list of optimal models, where the optimality criteria can be specified by the users to meet their specific needs.
  The methodology employed in TEAM is characterized by an exhaustive search and ranking of models. Initially, an exhaustive search of specifications is conducted for each time series, testing all possibilities for parameters such as data transformations (logarithms or levels), the order of the ARIMA model, inclusion of outliers, and calendar regressors. Subsequently, each specification is processed using the JDemetra+ software in a parallelized way, yielding diagnostic information to construct five indicators assessing the model's performance across distinct areas.
  
  The five indicators and their respective areas of evaluation are as follows:
  1. Model Diagnostics: Measures model adequacy using statistical tests on the residuals of the RegARIMA model and considering the statistical significance of the model coefficients and their autocorrelations.
  2. BIC: Measures the goodness of fit of the model to the data.
  3. Signal Extraction: Measures the model's efficacy in signal extraction using SEATS (via canonical decomposition).
  4. Revisions: Measures the magnitude of revisions when new data become available.
  5. Residual Seasonality: This indicator considers statistical tests on residual seasonality after the seasonal adjustment process is performed.
  
  To rank the models effectively, a final score is computed by appropriately combining the five indicators. Importantly, users retain the flexibility to adjust the weights assigned to each area according to their specific requirements. For instance, users could prioritize models with minimal revisions based on their preferences. Moreover, an alternative approach based on the Pareto boundary is also explored. Finally, TEAM presents the user with a selection of the best models based on the final score, enabling them to choose the most suitable model according to their needs.
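The ranking step described above can be sketched as follows. This is not the TEAM package's interface (TEAM is an R package): the indicator keys, the model names, the 0-to-1 "higher is better" scale and the weighted-average combination are all assumptions made for illustration. The sketch shows a weighted final score, a user re-weighting that favours small revisions, and the Pareto boundary as an alternative to a single score.

```python
# Hypothetical indicator keys, one per area listed in the abstract.
INDICATORS = ["diagnostics", "bic", "signal_extraction", "revisions", "residual_seasonality"]


def final_score(scores, weights):
    """Weighted average of the five indicator scores (assumed scale: higher is better)."""
    total_weight = sum(weights[name] for name in INDICATORS)
    return sum(weights[name] * scores[name] for name in INDICATORS) / total_weight


def rank_models(models, weights):
    """Return model names ordered from best to worst final score."""
    return sorted(models, key=lambda name: final_score(models[name], weights), reverse=True)


def pareto_front(models):
    """Return the models that no other model beats or ties on every indicator."""
    def dominates(a, b):
        return (all(a[k] >= b[k] for k in INDICATORS)
                and any(a[k] > b[k] for k in INDICATORS))

    return [name for name, scores in models.items()
            if not any(dominates(other, scores)
                       for other_name, other in models.items() if other_name != name)]


# Made-up scores for three candidate specifications of one series.
models = {
    "model_a": dict(zip(INDICATORS, [0.9, 0.8, 0.7, 0.6, 0.9])),
    "model_b": dict(zip(INDICATORS, [0.8, 0.7, 0.6, 0.5, 0.8])),
    "model_c": dict(zip(INDICATORS, [0.7, 0.6, 0.8, 0.9, 0.8])),
}
equal_weights = {name: 1.0 for name in INDICATORS}
revision_weights = dict(equal_weights, revisions=5.0)  # user prioritises small revisions

print(rank_models(models, equal_weights))
print(rank_models(models, revision_weights))
print(pareto_front(models))
```

With equal weights the hypothetical model_a comes first; raising the weight on revisions moves model_c to the top. The Pareto boundary keeps both and drops model_b, which is worse than model_a on every indicator.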
  
  Palabras clave / Key words
  time series, seasonal adjustment, JDemetra+
  
  Documento / Document
Enhancing economic statistics quality by addressing large multinationals data through a Large Cases Unit (LCU)
Sixto Muriel, Iván Pérez-Plaza, Juan Cervigón
- Doc.
  06/2024
  
  Resumen / Abstract
  The accelerating pace of economic globalization has introduced a multitude of challenges for statistical analysis, necessitating innovative frameworks to comprehend its multifaceted dynamics and capture them properly in statistics.
  Firstly, large multinational corporations expand their operations internationally through an increasingly complex network of subsidiaries, which significantly complicates the representativeness and consistency of national business statistics. A precise profiling of the group and the 'enterprise' within the group as a statistical unit for analysis is crucial in statistics based on types and sizes of companies.
  Secondly, the production processes of these large groups are often organized through global operations that are challenging or impossible to reflect accurately in statistical, administrative, and accounting sources. Furthermore, they usually rely on the frequent use of intangible assets with diffuse economic ownership.
  The existence of a specialized and trained unit within national statistical offices dedicated to engaging directly with multinational enterprises (MNEs) and analysing their global corporate and operational structure is key. This approach stands out as the most effective, and quite possibly the only, method to guarantee a definitive stride towards assuring the quality of national business and macroeconomic statistics (national accounts and balance of payments) in a globalized economy. This paper investigates the pivotal role of Large Cases Units (LCUs) as a strategic means to grapple with the intricate statistical implications of economic globalization and presents the first outcomes of the LCU in the Spanish statistical system.
  
  Palabras clave / Key words
  Globalization, business statistics, profiling, national accounts, balance of payments
  
  Documento / Document
The new wave of privacy concerns and its impact on official statistics
Yolanda Gómez Menchón, Ana Cánovas Zapata
- Doc.
  05/2024
  
  Resumen / Abstract
  One of the basic principles of all statistical offices is to guarantee data confidentiality. The first horizontal statistical law in the EU, from 1990, dealt with statistical confidentiality, which was recognised as the main statistical principle by the United Nations in 1994 and by the EU in 1997; statistical confidentiality was thus in place before the first Data Protection Directive of 1995. The Treaty of Amsterdam "constitutionalized" the statistical confidentiality principle in 1999, and it was implemented in Regulation 223/2009 on European Statistics and further elaborated in the European Statistics Code of Practice. In addition, the principle of statistical confidentiality works together with the other statistical principles of impartiality, reliability, objectivity, scientific independence, cost-effectiveness and non-imposition of excessive burdens on economic operators. Therefore, since the very beginning, statisticians in the EU have been treating individual data from natural and legal persons with the highest degree of protection. The question now is: what has changed in the last eight years?
  A new wave of privacy concerns arrived as a consequence of the challenges posed by rapid technological developments and globalisation. This resulted in a new regulation in the European Union aimed at harmonising personal data protection rules in the Member States (GDPR). Citizens and governments are nowadays more aware of and sensitive to this issue. The European Data Protection Supervisor and national and European lawyers interpret the law and activities in a restrictive way and, in the end, official statistics are made to pay for the sins of others.
  In this paper we address this issue with the aim of making evident how all GDPR data protection principles are already covered by the EU statistical principles, and what we could do to demonstrate this and so increase trust in official statistics. We will also analyse our principles in the light of the use of innovative methods for statistical production: is our code enough?
  
  Palabras clave / Key words
  Statistical law, personal data protection, Code of Practice, institutional environment, legal and privacy issues
  
  Documento / Document
Peer learning in Africa: a partnership for statistical capacity building
Ana Cánovas Zapata, Ana Carmen Saura Vinuesa, Teresa Paradinas Zorrilla, Dominique Francoz, Amalie Skovengaard, Janne Utkilen, Marika Pohjola, Dorota Paraluk
- Doc.
  04/2024
  
  Resumen / Abstract
  Building data capability in Africa is becoming a key objective for many international organizations, focusing efforts and resources in the region. The European Commission has become a key donor in the continent through the establishment of the Pan-African Statistics Programme II (PAS II). This programme aims to support the African integration process by strengthening the African Statistical System (AfSS), to ensure the use of quality statistical data in the African integration decision-making and policy monitoring, and to translate continental priorities at regional and national level.
  The PAS II programme constitutes a great opportunity to share European standards and best practices in statistics with African Union countries and to adapt them to the African context. The programme is being implemented via different budgetary mechanisms, one of which is grants awarded to National Statistics Institutes (NSIs) of the European Statistical System (ESS). This is a novelty in the context of statistical cooperation, since it is the first time that a project on statistical capacity building in Africa has been implemented through EU grants to Member States. The European Commission has awarded two grants: one aimed at developing social statistics in African NSIs (SOCSTAF), and another directed at developing economic and business statistics (ECOBUSAF). To cover both goals and enable transformation and progress, the creation of efficient partnerships is a key element for working across sectors and addressing common needs.
  In order to succeed in the implementation of this programme, two consortiums of European NSIs have been established which constitute an efficient partnership for capability and competence building, taking advantage of the key values of the different stakeholders. This partnership is a fundamental tool to increase the quality of the work, activities and practices of the institutions involved. Something that represents a great added value for the African counterparts is that the transmission of knowledge is implemented through a peer-to-peer approach, allowing for the sharing of good practices already implemented in similar organizations and collaborating to better adapt the procedures to the specific characteristics of the partner institutions. These good practices also include the way of working of the ESS as a supranational system, which could be of interest for the African statistical offices.
  In this paper we analyse the characteristics of this kind of capability model, based on the case study of the ongoing grants implementing PAS II, and share some lessons learnt so far in the project.
  
  Palabras clave / Key words
  Capability, capacity building, partnership, statistical cooperation, peer learning
  
  Documento / Document
New statistical resilience or how to survive in the data ecosystem
Ana Carmen Saura Vinuesa, Yolanda Gómez Menchón, Ana Cánovas Zapata
- Doc.
  03/2024
  
  Resumen / Abstract
  As declared by the Commission, "the European data strategy aims to make the EU a leader in a data-driven society. Creating a single market for data will allow it to flow freely within the EU and across sectors for the benefit of businesses, researchers and public administrations". This means that the Strategy necessarily crosses paths with statistical functions, which could be a threat or an opportunity for the ESS.
  Aware of this situation and taking into account all the previous initiatives already carried out by the ESS, the revision of the Statistical Law has been boosted to tap the potential of the new data sources for official statistics available in the current digitalised society.
  This paper scrutinizes the opportunities, challenges and potential roles for the statistical community in the future implementation of the amended Statistical Law within the context of the new digital and legal ecosystem. It also looks for synergies with the European Data Strategy and related legal acts (Data Governance Act, Artificial Intelligence Act and Interoperable Europe Act).
  
  Palabras clave / Key words
  Statistical law, institutional environment, digital ecosystem, new roles, legal issues
  
  Documento / Document
How to serve society through official statistics portals: the Spanish SDG indicators experience
Pedro Revilla, Antonio Salcedo, Ana Carmen Saura
- Doc.
  02/2024
  
  Resumen / Abstract
  Official statistical portals can serve society in several ways by providing access to high-quality data, supporting evidence-based decision making, and promoting transparency and accountability. The implementation of SDG indicator portals presents special challenges. In the case of Spain, there is an added difficulty, since its statistical system is decentralized both departmentally and territorially. This paper presents INE's experience in building and managing the National Reporting Platform on SDG indicators, and the way in which it tries to follow the principles of the EU Code of Practice. The Platform is proving to be an essential tool for meeting the challenge of monitoring the SDGs.
  
  Palabras clave / Key words
  Code of Practice, Quality Assurance Framework, NRP, SDGs, 2030 Agenda
  
  Documento / Document
Measuring intangible assets in the Spanish economy. Marketing assets
Sixto Muriel, Juan Cervigón, Teresa Ortiz
- Doc.
  01/2024
  
  Resumen / Abstract
  The measurement of intangible assets constitutes a field of traditional and current relevance for economic theory and accounting and statistical practice. Growth and productivity analysis models based on production factors (labour and capital) require the incorporation of all existing forms of capital, both tangible and intangible, which underscores the interest in such estimations.
  
  Palabras clave / Key words
  Intangibles, Marketing, Globalization, Growth, Productivity, SNA 2025
  
  Documento / Document
Early Estimates of the Industrial Turnover Index using Statistical Learning Algorithms
S. Barragán, L. Barreñada, J.F. Calatrava, J.C. Gálvez Sáenz de Cueto, J.M. Martín del Moral, E. Rosa-Pérez and D. Salgado
- Doc.
  03/2022
  
  Resumen / Abstract
  We use statistical learning algorithms to improve timeliness of the Spanish Industrial Turnover Index. The main idea is to use a gradient boosting algorithm to make a prediction for every single industrial turnover value not yet collected during the data collection, data editing and estimation phases. Regressors are constructed from the historical unit-level time series, current aggregated turnover moments and quantiles, and aggregated values of related industrial surveys. Accuracy indicators are also computed so that a quantitative trade-off between accuracy and timeliness can be appraised. This mass imputation exercise provides us with a nowcasting proposal which can be readily extended to many similar design-based surveys.
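To make the idea of predicting not-yet-collected values with gradient boosting concrete, here is a deliberately small, dependency-free sketch. It is not the authors' model: it uses a single hypothetical regressor (the unit's previous-period turnover), regression stumps as base learners and squared-error loss, whereas the paper builds many regressors from historical and aggregated data. All names and numbers are illustrative assumptions.

```python
def fit_stump(x, residuals):
    """Find the single split on x that minimises squared error on the residuals."""
    best = None
    cut_points = sorted(set(x))
    for low, high in zip(cut_points, cut_points[1:]):
        threshold = (low + high) / 2.0
        left = [r for xi, r in zip(x, residuals) if xi <= threshold]
        right = [r for xi, r in zip(x, residuals) if xi > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    if best is None:
        raise ValueError("need at least two distinct regressor values")
    return best[1], best[2], best[3]


def fit_boosting(x, y, n_rounds=100, learning_rate=0.1):
    """Gradient boosting for squared loss: each stump is fitted to the current residuals."""
    base = sum(y) / len(y)
    fitted = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        residuals = [yi - fi for yi, fi in zip(y, fitted)]  # negative gradient of squared loss
        threshold, left_value, right_value = fit_stump(x, residuals)
        stumps.append((threshold, left_value, right_value))
        fitted = [fi + learning_rate * (left_value if xi <= threshold else right_value)
                  for xi, fi in zip(x, fitted)]
    return base, learning_rate, stumps


def predict(model, xi):
    """Base value plus the shrunken contribution of every stump."""
    base, learning_rate, stumps = model
    return base + learning_rate * sum(
        left_value if xi <= threshold else right_value
        for threshold, left_value, right_value in stumps)


def mean_squared_error(predictions, truth):
    return sum((p - t) ** 2 for p, t in zip(predictions, truth)) / len(truth)


# Made-up respondents: previous-period turnover (regressor) and current turnover (target).
previous = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0]
current = [11.0, 19.0, 32.0, 41.0, 52.0, 59.0]
model = fit_boosting(previous, current)

baseline_mse = mean_squared_error([sum(current) / len(current)] * len(current), current)
training_mse = mean_squared_error([predict(model, xi) for xi in previous], current)
imputed = predict(model, 45.0)  # a unit whose current value has not been collected yet
print(baseline_mse, training_mse, imputed)
```

In a production setting one would normally rely on an established gradient boosting library and evaluate accuracy on held-out periods, in the spirit of the accuracy indicators the abstract mentions.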
  
  Palabras clave / Key words
  Machine Learning, Statistical Learning, Industrial Turnover Index, timeliness improvement, missing data imputation
  
  Documento / Document
Proposal for the Construction of an Urban Quality of Life Indicator (Propuesta para la Elaboración de un Indicador de Calidad de Vida Urbana)
Alex Costa, Antonio Argüeso, Dolors Cotrina and Sergio Porcel
- Doc.
  02/2022
  
  Resumen / Abstract
  Updated version (April 2023): Document
  
  Previous version (April 2022): Document
  
  The multidimensional measurement of urban quality of life is an objective related to three trends in official statistics over recent decades: approximating the well-being of the population; moving closer to the territory, down to the urban level; and using administrative data. Developing a statistic of this kind increases the quality of the statistical system, because it generates information that is directly relevant to the design and evaluation of public policies and because it does so efficiently, since working at municipal level necessarily means working with administrative registers.
  
  Palabras clave / Key words
  urban quality of life indicator, population standards, urban environment, urban supply, urban challenges
  
  Documento / Document
Identification System for complex statistical units (Sistema de Identificación para unidades estadísticas complejas)
Esteban Barbado Miguel, Pedro García Segador, Pilar Montero Robles, Valentín Llorente García and Miriam Hernandez Valencia
- Doc.
  01/2022
  
  Resumen / Abstract
  In recent years, the architecture of the Central Business Register (Directorio Central de Empresas, DIRCE) has incorporated enterprise groups and enterprises as new statistical units, consistent with the units already included in DIRCE, such as legal units and local units. Globally, enterprise groups do not usually have legal personality. Nor do enterprises formed by grouping legal units according to purely statistical methods. This means that no legal identifier is available. This situation limits analyses of the life of these units over time and particularly affects the production of Business Demography Indicators. New processes are needed, aimed at designing a robust Identification System for complex statistical units. This paper describes the construction of this identifier and the dynamic management over time of identifiers for enterprise groups and the enterprises operating within them.
  
  Palabras clave / Key words
  Statistical units, enterprise groups, enterprises, identifier, tracking over time
  
  Documento / Document
On new data sources for the production of official statistics
D. Salgado and B. Oancea
- Doc.
  01/2020
  
  Resumen / Abstract
  In the past years we have witnessed the rise of new data sources for the potential production of official statistics, which, by and large, can be classified as survey, administrative, and digital data. Apart from the differences in their generation and collection, we claim that their lack of statistical metadata, their economic value, and their lack of ownership by data holders pose several entangled challenges lurking in the incorporation of new data into the routine production of official statistics. We argue that every challenge must be duly overcome by the international community to bring about new statistical products based on these sources. These challenges can be naturally classified into different entangled issues regarding access to data, statistical methodology, quality, information technologies, and management. We identify the most relevant ones that must be tackled before new data sources can be considered fully incorporated into the production of official statistics.
  
  Palabras clave / Key words
  Digital data, administrative data, Big Data, official statistical production
  
  Documento / Document
The ESA 2010 pension table: An integrated view on the functioning of pension systems in Spain
Sixto Muriel de la Riva, Carlos J. Valero Rodríguez, Andrés García Carreira
- Doc.
  01/2019
  
  Resumen / Abstract
  The inexorable impact of population ageing, the peculiarities of pay-as-you-go pension schemes of public systems and the increasing role played by private systems in developed societies emphasize the need for a harmonized measure of accrued-to-date pension rights and obligations as one of the main priorities for statistical systems. Current national accounts standards (SNA 2008 and ESA 2010) already include guidelines for the registration in their systems of all employment-related private pension obligations/rights, regardless of whether they are systems with or without constitution of reserves. In addition, they propose the recording of all pension schemes, including contingent obligations/rights accrued in public systems, in a supplementary table. The supplementary table on accrued-to-date pension entitlements in social insurance will allow us to see the evolution of all pension rights stocks and the flows that motivate their variations, regardless of whether or not they are non-contingent financial assets/liabilities for the households/pension managers. Both the objectives and the data compiled in the table present obvious conceptual difficulties and require a high level of expert knowledge in the financial, insurance and actuarial fields. Thus, in the Spanish case, close collaboration with external agencies from various areas has been a basic component of the project, as a clear example of inter-institutional cooperation towards the highest standards of quality in official statistics. In addition, a highly flexible and adaptable SAS® software (PensINE) has been developed by INE for the actuarial estimation of accrued-to-date pension obligations/rights in public defined benefit schemes, which brings together a large part of the fruits of this collaboration. Finally, a didactic dissemination of the pension table results as a tool for analysing the functioning of national pension systems, but not as a measure of their future sustainability, is a challenging issue that the European Statistical System and other international organizations face nowadays.
  
  Palabras clave / Key words
  Pensions, national accounts, ageing
  
  Documento / Document
Data organisation and process design based on functional modularity for a standard production process
E. Esteban, M. Novás, S. Saldaña, D. Salgado, L.
- Doc.
  01/2018
  
  Resumen / Abstract
  We propose to use the principles of functional modularity to cope with the essential complexity of statistical production processes. Moving in the direction of international statistical production standards (GSBPM and GSIM), data organisation and process design under a combination of object-oriented and functional computing paradigms are proposed. The former comprises a standardised key-value pair abstract data model where keys are constructed by means of the structural statistical metadata of the production system. The latter makes extensive use of the principles of functional modularity (modularity, data abstraction, hierarchy, and layering) to design production steps. We provide a proof of concept focusing upon an optimization approach to selective editing applied to real survey data in standard production conditions at Statistics Spain (INE). Several R packages have been prototyped implementing these ideas. We also share diverse aspects arising from the practicalities of the implementation.
  
  Palabras clave / Key words
  Production Architecture, Key-value Pair Data Model, Standardisation, Functional
  
  Documento / Document
A modern vision of official statistical production
D. Salgado
- Doc.
  03/2016
  
  Resumen / Abstract
  This work is devoted to defending the claim that the modernisation and industrialisation of official statistical production needs a unified combination of statistics and computer science in its very principles. We illustrate our vision with concrete proposals under current implementation at Statistics Spain. Following a bottom-up approach, we give a precise formulation of the estimation problem in a finite population, which by using functional modularity principles has allowed us to propose a methodological classification of level-3 production tasks within the Generic Statistical Business Process Model. Additionally, in the same spirit, we show our attempts to industrialise the statistical data editing phase by carefully combining rigorous statistical methodology proposals with a light-weight object-oriented software implementation. Finally, we argue that the new sources of information for official statistics will underline the need for this unified combination.
  
  Palabras clave / Key words
  Modernization, Industrialization, Statistical Production, Statistical methodology, Computer Science
  
  Documento / Document
Process metadata development and implementation under the GSBPM v5.0 at Statistics Spain (INE)
D. Salgado, A.I. Sánchez-Luengo
- Doc.
  02/2016
  
  Resumen / Abstract
  Statistics Spain (INE) has recently developed and is currently implementing a standard for the documentation of all statistical production processes. This standard is based upon the Generic Statistical Business Process Model (GSBPM) and comprises a third level of sub-processes adapted to our needs. Each sub-process is documented by specifying its inputs, outputs, throughput, tools, documentation, and responsible unit(s). We borrow from computer science general principles such as modularity, abstraction, hierarchy, and layering to cope with the inherent complexity of a statistical production system. Here we offer a general description of the creation of this standard and of its on-going implementation. We include some reflections about the main difficulties towards a modern industrialised statistical production system.
  
  Palabras clave / Key words
  Process metadata, GSBPM, modernization of official statistics
  
  Documento / Document
Iris: Codificador automático internacional de Causas de muerte
Jesús Carrillo, Mª del Rosario González
- Doc.
  01/2016
  
  Resumen / Abstract
  The Statistics on Deaths by Cause of Death is one of the major sources of information for epidemiological research and for decision-making in health and social policy. The statistics consider the underlying cause of death, whose selection is based on the rules described in the International Classification of Diseases (ICD). Although qualified coders carry out the selection of the underlying cause, discrepancies in the interpretation of the ICD reduce the homogeneity of mortality statistics at the international level. Interest in improving data quality has led researchers to develop systems for coding and selecting the underlying cause of death. Iris is presented as a promising software package, the result of many years of effort and international cooperation, currently used by a growing number of countries.
  
  Palabras clave / Key words
  Statistics on Deaths by Cause of Death, underlying cause of death, International Classification of Diseases, automatic coder, coding
  
  Documento / Document
Propuesta de cuenta de producción de los hogares en España en 2010. Estimación de la serie 2003-2010
Carlos Angulo, Sara Hernández
- Doc.
  01/2015
  
  Resumen / Abstract
  Households carry out non-market activities that are not taken into account in the estimation of GDP, such as those related to food preparation, house cleaning or the care of children and the elderly. This working paper measures and values these activities in order to add them to the national accounts figures and thus obtain a household production account and an extended GDP that includes the valuation of domestic work.
  
  Palabras clave / Key words
  Time use, unpaid work, non-market household production, household satellite accounts, domestic work, extended GDP
  
  Documento / Document
Standardising the editing phase at Statistics Spain: a little step beyond EDIMBUS
Silvia Rama, David Salgado
- Doc.
  05/2014
  
  Resumen / Abstract
  We propose a slight generalization of the generic EDIMBUS editing and imputation strategy based on the notion of statistical production function and the inclusion of editing during data collection therein. Some first consequences are introduced such as the parametrization of the strategy in terms of the amount of cross-sectional information available for the execution of these functions and a minimal set of specification rules for them (already present in the literature). Also, we pose specific examples of the editing function whose goal is the selection of units for interactive editing so as to optimise resources. The whole proposal fits within the efforts for the modernisation of the statistical production process conducted at Statistics Spain.
  
  Palabras clave / Key words
  Editing strategy, EDIMBUS strategy, production function
  
  Documento / Document
Application of the optimization approach to selective editing in the Spanish Industrial Turnover Index and Industrial New Orders Received Index Survey
R. López-Ureña, M. Mancebo, S. Rama, D. Salgado
- Doc.
  04/2014
  
  Resumen / Abstract
  We describe in detail the redesign process of the editing and imputation strategy of the Spanish Industrial Turnover Index and Industrial New Orders Received Index survey. This process incorporates the optimization approach to selective editing in its combinatorial version, which we show to contain the score function approach for output editing as a particular case. We also include considerations about editing during data collection and a standardized expression for edits in short-term business statistics. The process spans from the design of the new edits to their implementation in production. As a global result, the rate of units selected for interactive editing (the most resource-consuming task, directly impinging on both cost-effectiveness and response burden) has been reduced by 20 percentage points on average without diminishing data quality.
  
  Palabras clave / Key words
  Selective editing, optimization approach, editing and imputation strategy design
  
  Documento / Document
Additional questions to better measure the self-declared professional status and how to link the mismatches produced in previous series through an econometric model
Javier Orche Galindo, Miguel Ángel García Martínez
- Doc.
  03/2014
  
  Resumen / Abstract
  From 2009 onwards, it was decided to include in the Spanish LFS questionnaire some additional questions for workers who self-declared as members of cooperatives, unpaid family workers or self-employed, so that professional status would be better measured. Since then, the previously observed upward mismatch in the level of the total number of self-employed workers has been almost completely corrected. The new data on professional status also identify which workers changed from self-employment to wage employment because of the supplementary questions. Therefore, after several quarters, it was possible to fit the change in professional status through an econometric model and a set of significant explanatory variables obtained from the rest of the questionnaire. Finally, we obtained a sufficiently good model and were able to apply the downward adjustment to the 2005-2008 self-employed series and the corresponding rise (by the same amount) to the wage employment series.
  
  Palabras clave / Key words
  Labour Force Survey, professional status, self-employment, backcasting, logistic model, imputation, goodness of logistic models
  
  Documento / Document
Comparación de los ingresos del trabajo entre la Encuesta de Condiciones de Vida y las fuentes administrativas
Pilar Vega, José María Méndez
- Doc.
  02/2014
  
  Resumen / Abstract
  This working paper presents a comparative study of the data on income from work collected by personal interview in the 2011 Living Conditions Survey (Encuesta de Condiciones de Vida, ECV) and the data from administrative sources. For both employee income and self-employment income, it analyses the income recipients as well as the differences between the income amounts reported by the respondent in the ECV and those provided by tax sources.
  
  Palabras clave / Key words
  Living Conditions Survey, income from work, administrative sources.
  
  Documento / Document
Otras facetas de la Encuesta de Empleo del Tiempo 2009-2010
Esperanza Vivas, Carlos Angulo, Sara Hernández, Raquel del Val
- Doc.
  01/2014
  
  Resumen / Abstract
  The analyses contained in this working paper cover several specific objectives of the 2009-2010 Time Use Survey (Encuesta de Empleo del Tiempo) that had no place in previous publications. The first chapter describes the places where human activity takes place, the second analyses how couples share household responsibilities and the third provides an economic valuation of the non-market productive activities of Spanish households.
  
  Palabras clave / Key words
  Time use, place, distribution of household responsibilities, unpaid work, household satellite account
  
  Documento / Document
Alternativas en la construcción de un indicador multidimensional de calidad de vida / Alternatives in the construction of a multidimensional quality of life indicator
Antonio Argüeso, Teresa Escudero, José María Méndez, María José Izquierdo
- Doc.
  01/2013
  
  Resumen / Abstract
  Versión en español / English version
  
  La medición multidimensional de la calidad de vida es uno de los aspectos con mayor futuro dentro de la estadística oficial. Distintas iniciativas internacionales animan a la recopilación de informes sobre esta materia y en particular al desarrollo de indicadores que intenten sintetizar la medición en un único indicador. Se presenta un análisis de la evolución de la calidad de vida en España basada en el estudio de nueve dimensiones usando como fuentes diversas encuestas entre las que destaca la encuesta de Condiciones de Vida. Se proponen además dos formas alternativas de sintetizar esa medición con sendos indicadores globales. Finalmente se analizan brevemente los retos a los que se enfrenta la estadística oficial para la medición de la calidad de vida.
  
  Multidimensional measurement of quality of life is one of the aspects with greater future potential in official statistics. Different international initiatives encourage the compiling of reports on this matter and in particular the development of indicators that set out to synthesize measurement in a single indicator. We present an analysis of the trend in the quality of life in Spain based on the study of nine dimensions using as sources various surveys, prominent amongst which is the Survey on Income and Living Conditions (EU-SILC). In addition, two alternative ways of synthesizing that measurement are put forward, each with its own global indicator. Finally, the challenges official statistics are facing in measuring quality of life are examined briefly.
  
  Palabras clave / Key words
  Indicadores de calidad de vida, medición multidimensional, condiciones de vida / Quality of life indicators, multidimensional measurement, living conditions
  
  Documento / Document
Proyecto para la capitalización del gasto en I+D en los nuevos sistemas de cuentas nacionales: estimación de su impacto sobre el PIB y compilación de una cuenta satélite de I+D / Project for the capitalization of expenditure on R&D in new systems of national accounts: estimating its impact on GDP and compilation of a satellite account of R&D
Alfredo Cristóbal Cristóbal, Mariano Gómez del Moral, Belén González Olmos
- Doc.
  02/2013
  
  Resumen / Abstract
  Versión en español / English version
  
  La medición de las variables y agregados económicos asociados a la actividad de I+D constituye un reto estadístico, especialmente en lo que se refiere a la integración y el análisis de los datos de las estadísticas básicas de I+D en un marco conceptual que permita relacionarlos con las variables y agregados macroeconómicos fundamentales. Así, es necesaria la elaboración de un instrumento que integre conceptual y económicamente la información básica disponible y que lo haga, además, en un marco contable consistente y comparable a escala internacional, como es el Sistema de Cuentas Nacionales.
  
  The measurement of the variables and economic aggregates associated with R&D is a statistical challenge, especially as regards the integration and analysis of data from basic R&D statistics within a conceptual framework that allows them to be related to the key macroeconomic variables and aggregates. Thus, it is necessary to develop a tool that integrates the available basic information conceptually and economically and does so, moreover, in an accounting framework that is consistent and internationally comparable, such as the System of National Accounts.
  
  Palabras clave / Key words
  Investigación, desarrollo, formación, bruta, capital, cuenta, satélite / Research and development, gross fixed capital formation, satellite account
  
  Documento / Document
Uso de fuentes administrativas para la reducción de carga y costes en las encuestas estructurales de empresas (UFAES) / Use Of Administrative Sources To Reduce Statistical Burden And Costs In Structural Business Surveys (UFAES)
Jorge Saralegui, Cristina González, Ignacio Arbués
- Doc.
  06/2012
  
  Resumen / Abstract
  Versión en español / A reduced English version
  
  El uso de fuentes administrativas con fines estadísticos forma parte de la actividad corriente del Instituto Nacional de Estadística español (INE) en diversas áreas. El proyecto UFAES supone un nuevo salto cualitativo en estas actividades, con objetivos orientados a reducir de manera significativa el tamaño muestral de las dos grandes operaciones estructurales de empresas en el INE.
  
  The use of administrative sources for statistical purposes is part of the current activity of the National Statistics Institute (INE, Spain) in various fields. The UFAES project gives a new qualitative impulse to these activities, with objectives oriented to significantly reducing the sample size of the two major INE annual structural business surveys.
  
  Palabras clave / Key words
  Uso, datos, fiscales, encuestas, económicas, reducción, costes, carga, estadística, empresas, administrativas, fines, estadísticos / integration, tax, microdata, enterprise, surveys, indirect, estimation, change, enterprise, structural, variables
Implementing a corporate-wide metadata driven production process at INE Spain
Pedro Revilla, José Luis Maldonado, José Luis Bercebal, Francisco Hernández
- Doc.
  05/2012
  
  Resumen / Abstract
  Like other national statistical institutes, INE has started the transition from the numerous stovepipe-like chains of production to more integrated production processes. The Generic Statistical Business Process Model (GSBPM) provides a framework for pursuing this goal. This paper describes INE's experience in developing this new model, based on a single standardized production line for all surveys, supported by metadata systems, generic and standardized tools and corporate databases.
  
  Palabras clave / Key words
  Process reengineering, Enterprise architecture, European Statistical System
  
  Documento / Document
Implementing a Quality Assurance Framework based on the Code of Practice at the National Statistical Institute of Spain
Pedro Revilla, Asunción Piñán
- Doc.
  04/2012
  
  Resumen / Abstract
  Quality has always been a constant concern at the National Statistical Institute of Spain (INE). Nevertheless, a more systematic approach has been implemented since the LEG on Quality recommendations and especially since the adoption of the Code of Practice. This paper describes INE's experience in implementing a Quality Assurance Framework based on the Code of Practice and on the Sponsorship on Quality recommendations.
  
  A quality structure was created, made up of a Quality Unit, a Quality Manager and a Quality Committee. Through this Committee, all INE units are involved in quality, taking decisions that, once approved by the Board of Directors, are adopted throughout the organization. Moreover, implementing a Quality Assurance Framework based on the Code of Practice is an INE project for 2012.
  
  Calculating the indicators of the Barometer of Quality, implementing a reference metadata system including a quality report, implementing a satisfaction survey, and adopting the GSBPM as a good practice are some of the actions put into practice.
  
  Palabras clave / Key words
  Quality assurance framework, Code of Practice, European statistics
  
  Documento / Document
Two greedy algorithms for a binary quadratically constrained linear program in survey data editing
David Salgado, Ignacio Arbués, María Elisa Esteban
- Doc.
  03/2012
  
  Resumen / Abstract
  We propose a binary quadratically constrained linear program as an approach to selective editing. In a practice-oriented framework and allowing for some overediting whilst strictly fulfilling accuracy restrictions, we propose two greedy algorithms to find feasible suboptimal solutions. Their running times are quartic and cubic, respectively, in the number of sampling units and linear in the number of restrictions. We present computational evidence from several hundred randomly generated instances.
  
  Palabras clave / Key words
  Combinatorial optimization, quadratic constraint, linear program, greedy algorithm, selective editing.
  
  Documento / Document
Testing the predictive ability of two classes of models
Ignacio Arbués, Cristina Casaseca, Ramiro Ledo, Silvia Rama
- Doc.
  02/2012
  
  Resumen / Abstract
  We propose tests for the null that the best model of a class produces as good forecasts as the best model of another one. Forecasts are evaluated using a loss function. Thus, causality can be tested if only the models in one class use a certain input. This is applied to the unemployment/inflation and industrial orders/production relationships. We find causality for the USA, but neither for France nor Spain.
  
  Palabras clave / Key words
  Evaluating forecasts, Loss function, Model selection, Causality, Bootstrap, Monte Carlo.
  
  Documento / Document
Analysis of the calendar effects on the Industry Turnover and New Orders Received Indices
Silvia Rama, Ignacio Arbués, María Mancebo, Luis Andrés de las Mozas, Eva María Vicente
- Doc.
  01/2012
  
  Resumen / Abstract
  Most economic monthly time series contain calendar effects. It is important to remove the calendar variation to allow an effective assessment of the variation due to other factors.
  
  Several methods exist which can adjust for trading-day and holiday effects in monthly economic time series. This paper reviews these methods and shows the procedure for determining the calendar adjustment carried out on the Industrial Turnover and New Orders Received Indices.
  
  Palabras clave / Key words
  Working-day adjustment, dynamic regression, ARIMA, model identification.
  
  Documento / Document
El INE y su producción estadística: una nota histórica sobre los últimos 50 años
Mariano Gómez del Moral
- Doc.
  11/2011
  
  Resumen / Abstract
  This article presents the history of INE over the last fifty years through the main statistical products that have characterised its activity. The description takes as reference points three of the most relevant political and economic milestones in the history of Spain in that period, which are linked to three kinds of information needs and three different ways of approaching the production of the data that made it possible to meet them. The article also details the main strategic lines that have underpinned the institution's work and that have allowed it to place itself in the leading group of European public statistical offices.
  
  Palabras clave / Key words
  Milestones, strategic lines, public good, international framework, planning
  
  Documento / Document
Modelling irrigation water consumption at the micro data level in the Survey of Production Methods in Agriculture 2009 (Spain)
Jorge Saralegui Gil, Fernando Celestino Rey
- Doc.
  10/2011
  
  Resumen / Abstract
  Preliminary studies for the second phase of the Agricultural Census 2009 (Survey of Production Methods) recommended not including the quantity of water consumed (required by the EU regulation) in the survey forms as a specific question, mainly due to the risk of high measurement errors. It was therefore decided to launch a project of model-assisted estimation in several stages, described in the paper:
  
  I) Theoretical water needs. After several treatments, the theoretical water requirements per crop are estimated based on an agroclimatic model.
  
  II) Adjustments for irrigation efficiency. In this stage the irrigation water needs per crop are imputed according to the irrigation techniques used by the holding.
  
  III) Management efficiency. Final estimation of effective consumption is implemented and adjusted to external sources to take into account the management efficiency of irrigation.
  
  Palabras clave / Key words
  Water used per crop. Theoretical water needs. Evapotranspiration. Irrigation efficiency
  
  Documento / Document
Metodología de estimación de Diplomados en Estadística del Estado en las delegaciones provinciales del INE
Julio César Hernández Sánchez, Cristobal Rojas Montoya
- Doc.
  09/2011
  
  Resumen / Abstract
  The aim is to estimate the number of State Statistics Graduates (Diplomados en Estadística del Estado, DEE) needed in each of INE's provincial delegations. The methodology suggests identifying different components in the estimation. Four models will be estimated for the blocks of economic operations, demographic operations, biennial operations, and the electoral census and municipal register; these are aggregated to obtain the first prediction. In addition, a global model will be built and the two predictions will be combined.
  
  Palabras clave / Key words
  DEE, electoral census, municipal register (padrón), workloads, estimation, INE provincial delegations
  
  Documento / Document
Integrating administrative data into the LFS data collection. The Spanish experience obtaining the variable INDECIL from administrative sources.
Miguel Ángel García Martínez, Javier Orche Galindo
- Doc.
  08/2011
  
  Resumen / Abstract
  Information on the level of wages of the main job has been compulsory in the LFS since 2009 (year of reference). Asking for income in household surveys is a sensitive issue that can affect the response rates and the confidence of the respondents. It was therefore decided to obtain the information from administrative sources. The Spanish LFS does not ask for the personal identification number of the respondents. The solution applied in the Spanish LFS was to incorporate the PIDN (personal identification number) from the population register by matching on both personal and location variables, and then to use this PIDN to link to the Social Security and Tax databases and incorporate the data on salaries needed to calculate the variable requested in the LFS. A general view of all the processes involved, the difficulties that we had to overcome and the main findings obtained in the preparation of the information are described.
  
  Palabras clave / Key words
  Labour force survey, record linkage, microintegration, combination of administrative data, validation of data sources, best estimation method
  
  Documento / Document
Towards a corporate-wide electronic data collection system at the National Statistical Institute of Spain
Pedro Revilla, José Luis Maldonado, José Manuel Bercebal
- Doc.
  07/2011
  
  Resumen / Abstract
  Electronic collections present new challenges and opportunities in order to improve editing tasks. They offer the possibility of using built-in edits in electronic questionnaires previously not possible in paper or other modes of data collection. This topic covers all issues relating to methods or strategies about editing of data acquired through electronic data collection (CAPI, CATI, CAWI, etc) and the way the respondents can carry out editing when using electronic questionnaires. Other related topics may include comparisons of editing practices between electronic collections and other collection modes, as well as different problems using multimode data collections. Measuring the respondent burden and the quality and reliability of the responses in order to provide valuable information to other survey processes is another issue of interest. Papers describing editing strategies to improve relationship with respondents or the general editing process are also welcome.
  
  Palabras clave / Key words
  Electronic questionnaires, electronic data reporting, web questionnaires, CAWI, IRIA,
  
  Documento / Document
Sampling coordination of business surveys in the Spanish National Statistics Institute
Dolores Lorca, M. Concepción Molina, Gonzalo Parada, Ana Revilla
- Doc.
  06/2011
  
  Resumen / Abstract
  The Spanish NSI is working on several alternatives in order to reduce the statistical burden in business surveys. One of them is the use of sampling coordination techniques to reduce the overlap between samples of different surveys.
  
  The use of the same sampling frame for business surveys (the Central Business Register) has made it possible to obtain coordinated samples using the Permanent Random Number (PRN) technique. A statistical burden function is defined and used to coordinate the samples obtained each year.
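As a rough illustration of the PRN idea mentioned in this abstract, the sketch below assigns each frame unit one fixed random number and draws each survey's sample as the units whose numbers come first after a survey-specific starting point. The function names, the starting points and the toy frame are assumptions made for illustration; the statistical burden function used at INE is not described here and is not modelled.

```python
import random

def assign_prns(unit_ids, seed=0):
    """Give every frame unit one permanent random number in [0, 1)."""
    rng = random.Random(seed)
    return {unit: rng.random() for unit in unit_ids}

def prn_sample(prns, n, start=0.0):
    """Select the n units whose PRN comes first after `start`,
    wrapping around the unit interval (hypothetical selection rule)."""
    ordered = sorted(prns, key=lambda unit: (prns[unit] - start) % 1.0)
    return set(ordered[:n])

# Toy frame with hand-picked PRNs so the behaviour is easy to follow.
toy_prns = {"a": 0.05, "b": 0.15, "c": 0.55, "d": 0.65, "e": 0.95}
survey_1 = prn_sample(toy_prns, 2, start=0.0)   # units nearest to 0.0
survey_2 = prn_sample(toy_prns, 2, start=0.5)   # units nearest to 0.5
overlap = survey_1 & survey_2
```

Using distant starting points keeps the two samples apart (negative coordination), while reusing the same starting point reproduces the same sample, which is what makes the random numbers "permanent".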
  
  Palabras clave / Key words
  Statistical burden, sampling coordination, business survey
  
  Documento / Document
Multivariate Wiener-Kolmogorov Filtering by Polynomial Methods
Félix Aparicio-Pérez
- Doc.
  05/2011
  
  Resumen / Abstract
  The exact computation of a general multivariate Wiener-Kolmogorov filter is usually done in state-space form. This paper develops a new method to solve the problem within the polynomial framework. To do so, several matrix polynomial techniques are used. To obtain the initial values, some new techniques are also developed. Finally, some extensions and applications are outlined.
  
  Palabras clave / Key words
  Wiener-Kolmogorov filter, polynomial matrices
  
  Documento / Document
Exploiting auxiliary information: selective editing as a combinatorial optimization problem
David Salgado
- Doc.
  04/2011
  
  Resumen / Abstract
  We formulate selective editing as a combinatorial optimization problem whose solution establishes which sampled units contain influential errors and thus must undergo interactive editing within a generic editing and imputation strategy. This optimization problem arises naturally from considerations on saving editing resources and controlling the accuracy of estimates. Cross-sectional auxiliary information is taken into account through linear mixed models assisting the construction of the problem's feasibility region. We provide a general algorithm for the univariate version of this problem, i.e. for editing one single variable. By applying this proposal to each questionnaire variable we illustrate its use upon the Spanish industrial turnover index and industrial new orders received index surveys. A reduction of interactive editing with a controlled increase in the error of the estimates is observed.
  
  Palabras clave / Key words
  Selective editing, combinatorial optimization, auxiliary information, linear mixed models
  
  Documento / Document
Study of variance estimation methods in the Spanish Labour Force Survey (EPA)
Gerardo Azor Martínez, Juan V. Jiménez Llorente, Carlos Pérez Arriero, Juana Porras Puga
- Doc.
  03/2011
  
  Resumen / Abstract
  The aim of this paper is to compare different methods for calculating sampling errors in the Spanish Labour Force Survey (EPA). Half sample replication (HSR) is the method currently employed to this end. We compare its results with those obtained with two more recent techniques, the standard delete-one jackknife and the Rao-Wu-Yue bootstrap. The paper begins with a brief description of the EPA methodology and continues with a theoretical presentation of the above-mentioned methods, followed by the coefficient of variation (CV) calculated for the estimates of the most important EPA variables in 2009. Finally, we present a more detailed study for the autonomous community of Galicia. In this NUTS 2 region the sample was enlarged in the third quarter of 2009, which allows us to study how the variance estimates change with the sample size.
  
  Palabras clave / Key words
  Sampling errors, half sample replication, jackknife, bootstrap
  
  Documento / Document
On the error of backcast estimates using conversion matrices under a change of classification
Ignacio Arbués, Natalia López
- Doc.
  02/2011
  
  Resumen / Abstract
  The classifications used by statistical agencies are sometimes updated. Hence, for the sake of comparability, it is necessary to estimate data from past periods according to the new classification. A frequently used method to calculate the estimates is through the use of Conversion Matrices. We present a theoretical analysis of this method and show with a practical example that it is possible to obtain useful estimates of the error.
  
  Palabras clave / Key words
  Change of Classification, Backcasting
  
  Documento / Document
Linking data from administrative records and the Living Conditions Survey
José María Méndez Martín, Pilar Vega Vicente
- Doc.
  01/2011
  
  Resumen / Abstract
  The Encuesta de Condiciones de Vida (Living Conditions Survey, LCS) is an annual survey compiled by the Instituto Nacional de Estadística (Spanish National Statistics Institute, INE). Access to administrative records offers a good opportunity to improve the quality of the relevant data and allow the use of a more efficient collection method. This paper offers a comparative analysis of different income components by linking the survey data with available data from the Spanish Tax Agency or Social Security system.
  
  Palabras clave / Key words
  Living Conditions Survey, administrative records, household income
  
  Documento / Document
Monthly Demographic Now Cast: monthly estimates of migration flows in Spain
Miguel Ángel Martínez Vidal, Sixto Muriel de la Riva
- Doc.
  08/2010
  
  Resumen / Abstract
  The Monthly Demographic Now Cast aims to fill a long-standing lack of information about the current demographic situation. It is also an innovative statistical operation that introduces a monthly basis into demographic analysis and shows a high level of accuracy immediately after the reference period. In particular, its results are decisive in providing advance estimates of current migration flows for the calculation of Spain's Population Now Cast.
  
  Palabras clave / Key words
  Monthly demographic estimations; immigration; emigration; expanding coefficient; registered flows
  
  Documento / Document
Determining the MSE-optimal cross section to forecast
Ignacio Arbués
- Doc.
  07/2010
  
  Resumen / Abstract
  We address the problem of which subset of time series to select among a given set in order to forecast another series. The forecasts are evaluated in terms of Mean Squared Error. We propose a family of criteria for which weak and strong consistency results are proved. The criteria are compared to some well-known hypothesis tests by means of Monte Carlo experimentation and a real-data example.
  
  Palabras clave / Key words
  forecasting, model selection, VARMA models
  
  Documento / Document
INE-Spain strategy on population estimates and projections: facing the challenge of the statistical measure of population
Miguel Ángel Martínez Vidal, Sixto Muriel de la Riva
- Doc.
  06/2010
  
  Resumen / Abstract
  The National Statistics Institute of Spain presents new actions focused on improving the available statistical sources of demographic information and providing accurate population figures and timely, detailed and consistent information on current demographic trends, in a context of general concern about the current and future evolution of the population pyramid.
  
  Palabras clave / Key words
  Demographic flows; population now cast; demographic projections
  
  Documento / Document
Effects of rotation groups, interviewing modes and interviewers on the LFS estimates
Florentina Álvarez, Juana Porras
- Doc.
  05/2010
  
  Resumen / Abstract
  This paper examines the influence of three factors (rotation groups, interview method and interviewer effect) on the main estimates of the Labour Force Survey (LFS), by performing probit and homogeneity analyses. It also shows that the influence of the interview method is partially due to the different representation of foreign people in the CATI and CAPI samples. Finally, it highlights the importance of correctly identifying the sources of bias and outlines the future plans to improve standardization in the Spanish LFS.
  
  Palabras clave / Key words
  Bias Sources, LFS, Rotation Groups, CAPI and CATI, Interviewer effects
  
  Documento / Document
Towards advanced methods for computing life tables
Sixto Muriel de la Riva, Margarita Cantalapiedra Malaguilla, Federico López Carrión
- Doc.
  04/2010
  
  Resumen / Abstract
  INE Spain subjects compiled data and the literature to a series of tests and comparative analyses in order to select the best-suited methodologies for computing life tables, aiming at an optimal use of the available data on deaths, better international comparability of mortality indicators and an enhanced approach to measuring mortality over small territorial areas.
  
  Palabras clave / Key words
  Mortality tables; risks of dying; regional mortality tables; different methods to build mortality tables
  
  Documento / Document
Changes of classification in continuous statistics: Calculation of retrospective series. Application to the Quarterly Labour Cost Survey
Amelia Fresneda Pacheco, María Ramos Charbonnier
- Doc.
  03/2010
  
  Resumen / Abstract
  A change in the classification system of a statistic produces new estimates that are not comparable with the previous ones. It is often desirable to provide time series of the results that are long enough to allow their analysis, which makes it necessary to compile retrospective series under the new system. This document explains the procedure applied to adjust the estimates of Labour Costs in Spain to the new National Classification of Economic Activities (CNAE09), the Spanish version of the EU Classification of Economic Activities, NACE Rev. 2.
  
  Palabras clave / Key words
  Backcasting, Calibration, Post-stratified estimator, Stratified Random Sampling, National Classification of Economic Activities, Quarterly Labour Cost Survey
  
  Documento / Document
A Class of stochastic optimization problems with application to selective data editing
Ignacio Arbués, Margarita González, Pedro Revilla
- Doc.
  02/2010
  
  Resumen / Abstract
  We present a new class of stochastic optimization problems where the solutions belong to an infinite-dimensional space of random variables. We prove existence of the solutions and show that under convexity conditions, a duality method can be used. The search for a good selective editing strategy is stated as a problem in this class, setting the expected workload as the objective function to minimize and imposing quality constraints.
  
  Palabras clave / Key words
  Stochastic Programming, Optimization in Banach Spaces, Selective Editing, Score Function
  
  Documento / Document
Construction of a synthetic environmental indicator. Results derived from the Encuesta de Hogares y Medio Ambiente 2008 (Survey on Households and the Environment 2008)
Carmen Teijeiro Breijo, Carlos Angulo Martín
- Doc.
  01/2010
  
  Resumen / Abstract
  The aim of constructing the environmental indicator is to position households and individuals according to their degree of awareness of environmental problems, taking into account both their collective and their individual behaviour. The proposed indicator seeks to summarise, in a more manageable form, the multidimensional information collected in the Encuesta de Hogares y Medio Ambiente 2008 (Survey on Households and the Environment 2008), conducted by INE, and allows comparisons by socioeconomic and territorial characteristics.
  
  Palabras clave / Key words
  Synthetic indicator, multivariate analysis, environment
  
  Documento / Document