Finnish Semantic Concreteness Dataset

Synthetic Biology & Genetic Engineering

Tags and Keywords

Literature

NLP

Finnish

Concreteness

Words

Poetry

Free

About

This dataset lists Finnish words with their concreteness values, which range from 1 (highly abstract) to 5 (highly concrete). It was produced to support poem generation in Finnish and research in Natural Language Generation (NLG), specifically the generation of Finnish poetry that takes aesthetic and framing elements into account, as described in a 2019 publication on the subject.

Columns

  • word: A Finnish word or multi-word expression. Some entries contain more than one word because of how the data was translated.
  • concreteness: The average concreteness of the top translations of the 'word', as a number from 1 (abstract) to 5 (highly concrete).
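
As a minimal sketch of how the two columns might be read, the Python snippet below parses CSV text with `word` and `concreteness` headers and checks that each value lies on the 1 to 5 scale. The header names are taken from the column list above; the exact file layout is an assumption, and the sample rows are invented placeholders, not values from the dataset.

```python
import csv
import io

# Placeholder rows invented for illustration; they are NOT taken from the dataset.
# The header row is an assumption based on the column names in the listing.
SAMPLE_CSV = """word,concreteness
kivi,4.9
ajatus,1.6
olla olemassa,2.1
"""


def load_concreteness(handle):
    """Read rows with 'word' and 'concreteness' columns into (word, score) pairs."""
    rows = []
    for record in csv.DictReader(handle):
        score = float(record["concreteness"])
        # The listing describes a 1-to-5 scale, so reject anything outside it.
        if not 1.0 <= score <= 5.0:
            raise ValueError(f"score out of range for {record['word']!r}: {score}")
        rows.append((record["word"], score))
    return rows


rows = load_concreteness(io.StringIO(SAMPLE_CSV))
```

To read a downloaded file instead, the same function could be given a handle opened with `open(path, newline="", encoding="utf-8")`, where `path` is wherever the CSV was saved.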

Distribution

The dataset is provided as a CSV file. It contains 35,780 unique words in approximately 35,805 records, so a small number of words appear more than once. Concreteness values range from 1.07 to 5.00, with substantial counts in every part of that range. For example, 2,688 records have concreteness values between 4.80 and 5.00.
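
The figures above could be checked against a downloaded copy with a short summary like the following Python sketch. It works on a list of `(word, score)` pairs; the pairs shown are invented placeholders, and treating the 4.80 to 5.00 range as inclusive at both ends is an assumption, since the listing does not say how its bins are bounded.

```python
from collections import Counter


def summarise(rows, low=4.80, high=5.00):
    """Summarise (word, score) pairs: record counts, duplicates, and value range."""
    counts = Counter(word for word, _ in rows)
    scores = [score for _, score in rows]
    return {
        "records": len(rows),
        "unique_words": len(counts),
        # Words that occur in more than one record.
        "duplicated_words": sorted(w for w, n in counts.items() if n > 1),
        "min": min(scores),
        "max": max(scores),
        # Inclusive bounds are an assumption about how the listing bins values.
        "in_range": sum(low <= s <= high for s in scores),
    }


# Placeholder pairs invented for illustration; NOT taken from the dataset.
rows = [("kivi", 4.9), ("ajatus", 1.6), ("kivi", 4.7), ("talo", 5.0)]
summary = summarise(rows)
```

On the full file, `records` and `unique_words` would be expected to land near 35,805 and 35,780 respectively if the listing's figures are accurate.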

Usage

This dataset is ideal for:
  • Developing and evaluating Finnish poetry generation systems.
  • Conducting research in Natural Language Processing (NLP), particularly in areas related to semantic properties of words.
  • Analysing and generating content for Finnish literature.
  • Psycholinguistic studies on word perception and concreteness in the Finnish language.
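
One simple way such a lexicon might be used when evaluating generated text is to score a line by the mean concreteness of the words it contains. The Python sketch below illustrates that idea only; it is not the method of the 2019 publication, the function name `mean_concreteness` is hypothetical, and the lexicon entries are invented placeholders. Because Finnish is heavily inflected, exact string matching will miss inflected forms, so a real pipeline would likely need lemmatisation first.

```python
def mean_concreteness(line, lexicon):
    """Return the mean score of the tokens in `line` found in `lexicon`, or None.

    Tokens are split on whitespace and lowercased; tokens missing from the
    lexicon are skipped rather than counted as zero.
    """
    scores = [lexicon[token] for token in line.lower().split() if token in lexicon]
    if not scores:
        return None
    return sum(scores) / len(scores)


# Placeholder lexicon invented for illustration; NOT taken from the dataset.
lexicon = {"kivi": 4.9, "ajatus": 1.6}

score_known = mean_concreteness("Kivi ajatus", lexicon)
score_unknown = mean_concreteness("tuntematon sana", lexicon)
```

Skipping unknown tokens keeps out-of-lexicon words from dragging the average down, at the cost of returning no score for lines with no matches.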

Coverage

The dataset covers the Finnish language only; its region is listed as global. It was produced to support a 2019 publication and was listed on the Opendatabay marketplace on 17/06/2025 as version 1.0.

License

CC-BY-NC

Who Can Use It

  • AI and LLM developers creating applications that interact with the Finnish language.
  • Researchers in computational linguistics, NLP, and psycholinguistics focusing on Finnish.
  • Data scientists and linguists interested in word semantics and text analysis in Finnish.
  • Creative technologists and artists working on generative poetry or text in Finnish.

Dataset Name Suggestions

  • Finnish Concreteness Lexicon
  • Finnish Word Concreteness Values
  • Finnish Poetry Generation Data
  • Finnish Semantic Concreteness Dataset

Attributes

Listing Stats

VIEWS

0

DOWNLOADS

0

LISTED

17/06/2025

REGION

GLOBAL

UDQS QUALITY (UNIVERSAL DATA QUALITY SCORE)

5 / 5

VERSION

1.0