Data Ethics and Privacy

The FashionBrain Project recognises the importance of data ethics and privacy issues. The Consortium has formed an Ethics Committee, consisting of independent ethics advisers and experts in the fields of Image Recognition, Crowdsourcing, Data Science, Web Systems, and Media and Communication, supported by the University of Sheffield Research Ethics Committee. The Committee has developed a high-level framework to help resolve ethical concerns. The framework covers identifying the problems and the parties involved, implementing a course of action and monitoring its progress, while minimising any adverse effects on FashionBrain’s operating platform.

Along with the Consent Manager, the Committee has recommended that this privacy page be added to explain what types of data are being collected and what the data is subsequently used for. By making this communication transparent, the project aims to demonstrate strong ethical principles and to reassure the public.

What is the project’s goal?

The FashionBrain project aims to combine data from different sources to support different fashion industry players, both by predicting upcoming fashion trends from social media and by providing personalized recommendations and advanced fashion item search to customers.

What data sources does the project use?

The FashionBrain project may collect personal information from online social media and, overall, will be using three types of data:

Proprietary data owned by an organization (and most likely one of the partners) which is restricted by institutional policy and regulation and has been ethically collected. All datasets containing personal data will be anonymised before being provided to the consortium.
Previously collected data from earlier research campaigns, logs from proprietary search engines or other systems which the organization has granted permission to use, or data sets provided by other researchers. In all cases, the supervisor or host will be responsible for ensuring that the data was ethically collected.
Newly collected data that may come from a) self-reported data from interviews or questionnaires, b) unobtrusive observation/crowdsourcing (in person and/or via computer) using log files and screen capture video, c) metadata, text and images from articles, blogs, and social media posts.

Prior to any user/experimental/interview sessions, all potential participants will be fully informed about the objectives of the research, the procedures to be used, and the data to be collected; all will have the right to stop participating at any time and, at the end, the right to remove any data they have contributed.

Unless stated otherwise, all participation will be anonymised and any reporting, which will be done in the aggregate, will ensure that anonymity is protected. All participants will be treated respectfully and with dignity, acknowledging the service and commitment that participants provide to support research. Practically, this means that all participants will be presented with an information sheet that describes the objectives of the research and outlines the method to be deployed, the data to be collected, and the subsequent use of the data. The rights of the participants with respect to the collected data and their participation will be outlined in ordinary language in a consent form to be signed by the participant and the researcher. The exact details may vary from organization to organization, but the underlying principle is respected and applied by all researchers.
In the case of Instagram images and metadata, explicit consent is not obtained, and additional measures have been taken, as explained in the remainder of this document.

What does the project do with the collected data?

We strive to maintain the highest level of anonymity of all individual users in our work, only keeping data which is essential to the project’s objectives. Once collected data has been filtered to remove spam and irrelevant content, aggregated statistics and models will be produced. We will destroy all personal data once it is no longer needed for the project’s purposes. In the case of Instagram posts, metadata about the images and information about their classification will be produced.
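
As a purely illustrative sketch, and not a description of the project’s actual pipeline, the filter-then-aggregate step described above could look roughly like the following Python. The field names `account_id`, `is_spam` and `label` are assumptions made for this example only; the point is that the output keeps counts per classification and no per-user fields.

```python
from collections import Counter
from typing import Dict, Iterable


def aggregate_labels(posts: Iterable[dict]) -> Dict[str, int]:
    """Count posts per classification label, keeping no per-user fields.

    The keys "is_spam" and "label" are hypothetical field names.
    """
    counts: Counter = Counter()
    for post in posts:
        # Skip content already flagged as spam or irrelevant.
        if post.get("is_spam"):
            continue
        # Only the label is carried forward; account IDs are never copied.
        counts[post["label"]] += 1
    return dict(counts)


sample_posts = [
    {"account_id": "1", "label": "dress", "is_spam": False},
    {"account_id": "2", "label": "dress", "is_spam": False},
    {"account_id": "3", "label": "sneakers", "is_spam": False},
    {"account_id": "4", "label": "dress", "is_spam": True},
]
summary = aggregate_labels(sample_posts)
```

Here `summary` contains only label counts, so nothing in it can be traced back to an individual account.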

How does the project treat public data?

In some cases, the FashionBrain project will be collecting data from publicly available sources, namely Instagram posts. This means that, in principle, all relevant authorisation and consent have been provided by the owners of the data. The project will only access social network content through the official APIs, and users on these networks have given consent to the network to share their data with third parties.
However, the consortium recognises that the interpretation of privacy laws varies across the EU and that social network data which is public might be considered private even if the user has given consent to the social network to share their data. This legal grey area is a concern, but it is not practical for the project to get a double opt-in from social network users, as this would require the users to voluntarily opt in to FashionBrain data collection or FashionBrain to contact every single Instagram user requesting consent to use their data. No similar analytics service performs this double opt-in.
To address this issue, FashionBrain has provided a Consent Manager on the project website, which allows the public to request a blind opt-out from data collection. If a participant voluntarily provides their social network account ID number, either via the Consent Manager or by email, they are sending only their account ID to the FashionBrain administrator. From the date of receipt, we will destroy all request communications. For a participant’s content to be removed from FashionBrain activities, the account ID is added to a static blacklist table; all incoming content from accounts matching this blacklist will be automatically discarded.
The Consent Manager can be found at ./consent_manager.
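
For readers who want a concrete picture of the opt-out, the blacklist check can be sketched as below. This is a minimal illustration under stated assumptions, not the project’s implementation: the `account_id` field, the `discard_blacklisted` function and the example IDs are all hypothetical.

```python
from typing import FrozenSet, Iterable, Iterator

# Hypothetical static blacklist: account IDs submitted via the Consent
# Manager or by email. The values here are made up for illustration.
BLACKLIST: FrozenSet[str] = frozenset({"1234567890"})


def discard_blacklisted(
    posts: Iterable[dict], blacklist: FrozenSet[str] = BLACKLIST
) -> Iterator[dict]:
    """Yield only posts whose account ID is not on the opt-out blacklist."""
    for post in posts:
        account_id = post.get("account_id")
        # Conservative choice for this sketch: a post with no account ID
        # cannot be checked against the blacklist, so it is dropped too.
        if account_id is None or str(account_id) in blacklist:
            continue
        yield post


incoming = [
    {"account_id": "1234567890", "caption": "from an opted-out account"},
    {"account_id": "555", "caption": "from any other account"},
    {"caption": "no account ID attached"},
]
kept = list(discard_blacklisted(incoming))
```

Because the check runs on incoming content before anything is stored or analysed, only the account ID itself needs to be retained to honour an opt-out.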

How is data stored in the FashionBrain Project?

The project has implemented an evolving Data Management Plan (DMP), which details the management of all the data that will be collected, processed or generated by the project during its lifespan.

Briefly, data collected by the FashionBrain project will be stored and secured in password-protected digital storage that meets the strict requirements of each institution and is compliant with H2020 regulations. This will be done to protect participants’ identities and to ensure that the unique data collected by the project is secure.

What data will be shared by the FashionBrain Project?

For any data that subsequently becomes part of a sharable data set (with the permission of participants as outlined in the formal consent), the data will be anonymised to protect participants’ identity before being released/transferred. Video, images, and audio data will not be released unless anonymity can be guaranteed.
In all cases, any restrictions related to existing or newly collected data will be included with the shared data for future use.

Transmission of content to third-party services

The FashionBrain project will not use any third-party service to process the data.