Package 'handwriter' reference manual

Title:	Handwriting Analysis in R
Description:	Perform statistical writership analysis of scanned handwritten documents. Webpage provided at: <https://github.com/CSAFE-ISU/handwriter>.
Authors:	Iowa State University of Science and Technology on behalf of its Center for Statistics and Applications in Forensic Evidence [aut, cph, fnd], Nick Berry [aut], Stephanie Reinders [aut, cre], James Taylor [aut], Felix Baez-Santiago [ctb], Jon González [ctb]
Maintainer:	Stephanie Reinders <[email protected]>
License:	GPL-3
Version:	3.2.4.9000
Built:	2025-02-08 02:47:56 UTC
Source:	https://github.com/csafe-isu/handwriter

About Variable

Description

about_variable() returns information about the model variable.

Usage

about_variable(variable, model)

Arguments

`variable`	A variable in the fitted model output by `fit_model()`
`model`	A fitted model created by `fit_model()`

Value

Text that explains the variable

Examples

about_variable(
  variable = "mu[1,2]",
  model = example_model
)

addToFeatures

Description

addToFeatures

Usage

addToFeatures(FeatureSet, LetterList, vectorDims)

Arguments

`FeatureSet`	The current list of features that have been calculated
`LetterList`	List of all letters and their information
`vectorDims`	Vector with the image dimensions

Value

A list consisting of current features calculated in FeatureSet as well as measures of compactness, loop count, and loop dimensions

Analyze Questioned Documents

Description

analyze_questioned_documents() estimates the posterior probability of writership for the questioned documents using Markov Chain Monte Carlo (MCMC) draws from a hierarchical model created with fit_model().

Usage

analyze_questioned_documents(
  main_dir,
  questioned_docs,
  model,
  num_cores,
  writer_indices,
  doc_indices
)

Arguments

`main_dir`	A directory that contains a cluster template created by `make_clustering_template()`
`questioned_docs`	A directory containing questioned documents
`model`	A fitted model created by `fit_model()`
`num_cores`	An integer number of cores to use for parallel processing with the `doParallel` package.
`writer_indices`	A vector of start and stop characters for writer IDs in file names
`doc_indices`	A vector of start and stop characters for document names in file names

Value

A list of likelihoods, votes, and posterior probabilities of writership for each questioned document.

Examples

## Not run: 
main_dir <- "/path/to/main_dir"
questioned_docs <- "/path/to/questioned_images"
analysis <- analyze_questioned_documents(
  main_dir = main_dir,
  questioned_docs = questioned_docs,
  model = model,
  num_cores = 2,
  writer_indices = c(2, 5),
  doc_indices = c(7, 18)
)
analysis$posterior_probabilities

## End(Not run)
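
The writer_indices and doc_indices arguments give start and stop character positions within each file name. The following is a minimal base R sketch of that indexing, assuming the IDs are extracted with something equivalent to substr(); the package's internal code may differ. The file names follow the CSAFE naming pattern used for the example data in this manual.

```r
# File names that follow the CSAFE naming pattern (extension removed)
file_names <- c("w0016_s01_pLND_r01", "w0080_s01_pLND_r01")

writer_indices <- c(2, 5)   # characters 2 through 5 hold the writer ID
doc_indices <- c(7, 18)     # characters 7 through 18 hold the document name

# Assumption: plain substring extraction; handwriter's internals may differ
writers <- substr(file_names, writer_indices[1], writer_indices[2])
docs <- substr(file_names, doc_indices[1], doc_indices[2])

print(writers)  # "0016" "0080"
print(docs)     # "s01_pLND_r01" "s01_pLND_r01"
```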

Calculate Accuracy

Description

Fit a model with fit_model() and calculate posterior probabilities of writership with analyze_questioned_documents() for a set of test documents where the ground truth is known. Then use calculate_accuracy() to measure the accuracy of the fitted model on the test documents. Accuracy is calculated as the average posterior probability assigned to the true writer.

Usage

calculate_accuracy(analysis)

Arguments

`analysis`	Writership analysis output by `analyze_questioned_documents()`

Value

The model's accuracy on the test set as a number

Examples

# calculate the accuracy for example analysis performed on test documents and a model with 1 chain
calculate_accuracy(example_analysis)

## Not run: 
main_dir <- "/path/to/main_dir"
test_images_dir <- "/path/to/test_images"
analysis <- analyze_questioned_documents(
  main_dir = main_dir,
  questioned_docs = test_images_dir,
  model = model,
  num_cores = 2,
  writer_indices = c(2, 5),
  doc_indices = c(7, 18)
)
calculate_accuracy(analysis)

## End(Not run)
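
The accuracy measure described above, the average posterior probability assigned to the true writer, can be sketched in base R. The matrix layout, writer labels, and document names below are hypothetical and are not the structure returned by analyze_questioned_documents(); only the averaging step is illustrated.

```r
# Hypothetical posterior probabilities:
# rows are known writers, columns are test documents (each column sums to 1)
post <- matrix(
  c(0.90, 0.10, 0.00,   # doc_a
    0.20, 0.70, 0.10,   # doc_b
    0.05, 0.15, 0.80),  # doc_c
  nrow = 3,
  dimnames = list(c("w1", "w2", "w3"), c("doc_a", "doc_b", "doc_c"))
)

# Ground truth: the writer of each test document
true_writer <- c(doc_a = "w1", doc_b = "w2", doc_c = "w3")

# Pick out the probability assigned to the true writer of each document
p_true <- post[cbind(true_writer, colnames(post))]

# Accuracy is the mean of those probabilities: (0.90 + 0.70 + 0.80) / 3
accuracy <- mean(p_true)
print(accuracy)
```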

cleanBinaryImage

Description

Removes alpha channel from png image.

Usage

cleanBinaryImage(img)

Arguments

`img`	A matrix of 1s and 0s.

Value

png image with the alpha channel removed

Cursive written word: csafe

Description

Cursive written word: csafe

Usage

csafe

Format

Binary image matrix. 111 rows and 410 columns.

Examples

csafe_document <- list()
csafe_document$image <- csafe
plotImage(csafe_document)
csafe_document$thin <- thinImage(csafe_document$image)
plotImageThinned(csafe_document)
csafe_processList <- processHandwriting(csafe_document$thin, dim(csafe_document$image))

Drop Burn-In

Description

drop_burnin() removes the burn-in from the Markov Chain Monte Carlo (MCMC) draws.

Usage

drop_burnin(model, burn_in)

Arguments

`model`	A list of MCMC draws from a model fit with `fit_model()`.
`burn_in`	An integer number of starting iterations to drop from each MCMC chain.

Value

A list of data frames of MCMC draws with burn-in dropped.
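
Dropping burn-in amounts to discarding the first burn_in iterations of each chain. The following base R sketch shows this on a single hypothetical data frame of draws; the column names and the simulated values are assumptions, and the real model object has a different structure.

```r
set.seed(1)

# Hypothetical MCMC draws for one chain: 100 iterations of one variable
draws <- data.frame(iteration = 1:100, mu = rnorm(100))

burn_in <- 25

# Keep only the iterations after the burn-in period
kept <- draws[draws$iteration > burn_in, ]

nrow(kept)            # 75 iterations remain
min(kept$iteration)   # the first kept iteration is 26
```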

Examples

model <- drop_burnin(model = example_model, burn_in = 25)
plot_trace(variable = "mu[1,2]", model = model)

Example of writership analysis

Description

Example of writership analysis

Usage

example_analysis

Format

The results of analyze_questioned_documents() stored in a named list with 5 items:

graph_measurements: A data frame that shows the writer, document name, cluster assignment, slope, principal component rotation angle, and wrapped principal component rotation angle for each graph in each questioned document.
cluster_fill_counts: A data frame of the cluster fill counts for each questioned document.
likelihoods: A list of data frames where each data frame contains the likelihoods for a questioned document for each MCMC iteration.
votes: A list of vote tallies for each questioned document.
posterior_probabilities: A list of posterior probabilities of writership for each questioned document and each known writer in the closed set used to train the hierarchical model.

Examples

plot_cluster_fill_counts(formatted_data = example_analysis)
plot_posterior_probabilities(analysis = example_analysis)

Example cluster template

Description

An example cluster template created with make_clustering_template(). The cluster template was created from handwriting samples "w0016_s01_pLND_r01.png", "w0080_s01_pLND_r01.png", "w0124_s01_pLND_r01.png", "w0138_s01_pLND_r01.png", and "w0299_s01_pLND_r01.png" from the CSAFE Handwriting Database. The template has K=5 clusters.

Usage

example_cluster_template

Format

A list containing a single cluster template created by make_clustering_template(). The cluster template was created by sorting a random sample of 1000 graphs from 10 training documents into 10 clusters with a K-means algorithm. The cluster template is a named list with 16 items:

centers_seed: An integer for the random number generator.
cluster: A vector of cluster assignments for each graph used to create the cluster template.
centers: The final cluster centers produced by the K-Means algorithm.
K: The number of clusters to build (10) with the K-means algorithm.
n: The number of training graphs to use (1000) in the K-means algorithm.
docnames: A vector that lists the training document from which each graph originated.
writers: A vector that lists the writer of each graph.
iters: The maximum number of iterations for the K-means algorithm (3).
changes: A vector of the number of graphs that changed clusters on each iteration of the K-means algorithm.
outlierCutoff: A vector of the outlier cutoff values calculated on each iteration of the K-means algorithm.
stop_reason: The reason the K-means algorithm terminated.
wcd: A matrix of the within cluster distances on each iteration of the K-means algorithm. More specifically, the distance between each graph and the center of the cluster to which it was assigned on each iteration.
wcss: A vector of the within-cluster sum of squares on each iteration of the K-means algorithm.

Examples

# view cluster fill counts for template training documents
template_data <- format_template_data(example_cluster_template)
plot_cluster_fill_counts(template_data, facet = TRUE)

Example of a hierarchical model

Description

Example of a hierarchical model

Usage

example_model

Format

A hierarchical model created by fit_model with a single chain of 100 MCMC iterations. It is a named list of 4 objects:

graph_measurements: A data frame of model training data that shows the writer, document name, cluster assignment, slope, principal component rotation angle, and wrapped principal component rotation angle for each training graph.
cluster_fill_counts: A data frame of the cluster fill counts for each model training document.
rjags_data: The model training information from graph_measurements and cluster_fill_counts formatted for RJAGS.
fitted_model: A model fit using the rjags_data and the RJAGS and coda packages. It is an MCMC list that contains a single MCMC object.

Examples

# convert to a data frame and view all variable names
df <- as.data.frame(coda::as.mcmc(example_model$fitted_model))
colnames(df)

# view a trace plot
plot_trace(variable = "mu[1,1]", model = example_model)

# drop the first 25 MCMC iterations for burn-in
model <- drop_burnin(model = example_model, burn_in = 25)

## Not run: 
# analyze questioned documents
main_dir <- "/path/to/main_dir"
questioned_docs <- "/path/to/questioned_documents_directory"
analysis <- analyze_questioned_documents(
   main_dir = main_dir,
   questioned_docs = questioned_docs,
   model = example_model,
   num_cores = 2,
   writer_indices = c(2, 5),
   doc_indices = c(7, 18)
)
analysis$posterior_probabilities

## End(Not run)

Extract Graphs

Description

Superseded. See Details.

Usage

extractGraphs(source_folder = getwd(), save_folder = getwd())

Arguments

`source_folder`	path to folder containing .png images
`save_folder`	path to folder where graphs are saved to

Details

Development on 'extractGraphs()' is complete. We recommend using 'process_batch_dir()' instead.

Extracts graphs from .png images and saves each by their respective writer.

Value

saves graphs in an rds file

Examples

## Not run: 
sof <- "path to folder containing .png images"
saf <- "path to folder where graphs will be saved to"
extractGraphs(sof, saf)

## End(Not run)

Fit Model

Description

fit_model() fits a Bayesian hierarchical model to the model training data in model_docs and draws samples from the model as Markov Chain Monte Carlo (MCMC) estimates.

Usage

fit_model(
  main_dir,
  model_docs,
  num_iters,
  num_chains = 1,
  num_cores,
  writer_indices,
  doc_indices,
  a = 2,
  b = 0.25,
  c = 2,
  d = 2,
  e = 0.5
)

Arguments

`main_dir`	A directory that contains a cluster template created by `make_clustering_template()`
`model_docs`	A directory containing model training documents
`num_iters`	An integer number of iterations of MCMC.
`num_chains`	An integer number of chains to use.
`num_cores`	An integer number of cores to use for parallel processing clustering assignments. The model fitting is not done in parallel.
`writer_indices`	A vector of the start and stop character of the writer ID in the model training file names. E.g., if the file names are writer0195_doc1, writer0210_doc1, writer0033_doc1 then writer_indices is 'c(7,10)'.
`doc_indices`	A vector of the start and stop character of the "document name" in the model training file names. This is used to distinguish between two documents written by the same writer. E.g., if the file names are writer0195_doc1, writer0195_doc2, writer0033_doc1, writer0033_doc2 then doc_indices are 'c(12,15)'.
`a`	The shape parameter for the Gamma distribution in the hierarchical model
`b`	The rate parameter for the Gamma distribution in the hierarchical model
`c`	The first shape parameter for the Beta distribution in the hierarchical model
`d`	The second shape parameter for the Beta distribution in the hierarchical model
`e`	The scale parameter for the hyper prior for mu in the hierarchical model

Value

A list of training data used to fit the model and the fitted model

Examples

## Not run: 
main_dir <- "/path/to/main_dir"
model_docs <- "path/to/model_training_docs"
questioned_docs <- "path/to/questioned_docs"

model <- fit_model(
  main_dir = main_dir,
  model_docs = model_docs,
  num_iters = 100,
  num_chains = 1,
  num_cores = 2,
  writer_indices = c(2, 5),
  doc_indices = c(7, 18)
)

model <- drop_burnin(model = model, burn_in = 25)

analysis <- analyze_questioned_documents(
  main_dir = main_dir,
  questioned_docs = questioned_docs,
  model = model,
  num_cores = 2,
  writer_indices = c(2, 5),
  doc_indices = c(7, 18)
)
analysis$posterior_probabilities

## End(Not run)
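
To get a feel for the default hyperparameters, the sketch below draws from a Gamma distribution with shape a = 2 and rate b = 0.25 and from a Beta distribution with shape parameters c = 2 and d = 2. This manual does not say which model parameters these priors are attached to, so the code only illustrates the shapes of the two distributions and is not the model itself.

```r
set.seed(1)
n <- 10000

# Gamma prior with the default shape (a) and rate (b); mean = a / b = 8
a <- 2
b <- 0.25
gamma_draws <- rgamma(n, shape = a, rate = b)

# Beta prior with the default shape parameters (c and d); mean = c / (c + d) = 0.5
# Named c_shape and d_shape here to avoid masking base::c()
c_shape <- 2
d_shape <- 2
beta_draws <- rbeta(n, shape1 = c_shape, shape2 = d_shape)

summary(gamma_draws)
summary(beta_draws)
```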

Format Template Data

Description

format_template_data() formats the template data for use with plot_cluster_fill_counts(). The output is a list that contains a data frame called cluster_fill_counts.

Usage

format_template_data(template)

Arguments

`template`	A single cluster template created by `make_clustering_template()`

Value

List that contains the cluster fill counts

Examples

template_data <- format_template_data(template = example_cluster_template)
plot_cluster_fill_counts(formatted_data = template_data, facet = TRUE)

Get Cluster Fill Counts

Description

get_cluster_fill_counts() creates a data frame that shows the number of graphs in each cluster for each input document.

Usage

get_cluster_fill_counts(df)

Arguments

`df`	A data frame of cluster assignments from `get_clusters_batch`. The data frame has columns `docname` and `cluster`. Each row corresponds to a graph and lists the document from which the graph was obtained and the cluster to which that graph is assigned. Optionally, the data frame might also have `writer` and `doc` columns. If present, `writer` lists the writer ID of each document and `doc` is an identifier to distinguish between different documents from the same writer.

Value

A data frame of cluster fill counts for each document in the input data frame.

Examples

docname <- c(rep("doc1", 20), rep("doc2", 20), rep("doc3", 20))
writer <- c(rep(1, 20), rep(2, 20), rep(3, 20))
doc <- c(rep(1, 20), rep(2, 20), rep(3, 20))
cluster <- sample(3, 60, replace = TRUE)
df <- data.frame(docname, writer, doc, cluster)
get_cluster_fill_counts(df)
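
Conceptually, a cluster fill count is a tally of graphs per document and cluster. The following base R sketch reproduces that tally with table() on a small hand-made data frame; it illustrates the idea only and does not reproduce the column layout of the data frame returned by get_cluster_fill_counts().

```r
# Five graphs from two documents, each assigned to one of three clusters
df <- data.frame(
  docname = c("doc1", "doc1", "doc1", "doc2", "doc2"),
  cluster = c(1, 1, 3, 2, 3)
)

# Treat cluster as a factor so that empty clusters still get a zero count
df$cluster <- factor(df$cluster, levels = 1:3)

# Rows are documents, columns are clusters
counts <- table(docname = df$docname, cluster = df$cluster)
print(counts)
```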

Get Cluster Fill Rates

Description

get_cluster_fill_rates() creates a data frame that shows the proportion of graphs assigned to each cluster in a cluster template.

Usage

get_cluster_fill_rates(df)

Arguments

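`df`	A data frame of cluster assignments with one row per graph, such as the data frame built in the Examples.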

Value

A data frame of cluster fill rates.

Examples

docname <- c(rep("doc1", 20), rep("doc2", 20), rep("doc3", 20))
writer <- c(rep(1, 20), rep(2, 20), rep(3, 20))
doc <- c(rep(1, 20), rep(2, 20), rep(3, 20))
cluster <- sample(3, 60, replace = TRUE)
df <- data.frame(docname, writer, doc, cluster)
rates <- get_cluster_fill_rates(df)
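
A cluster fill rate is a cluster fill count divided by the total number of graphs in the document. The base R sketch below computes rates from a hand-made table of counts with prop.table(); it illustrates the arithmetic only and is not the package's implementation.

```r
# Five graphs from two documents, each assigned to one of three clusters
df <- data.frame(
  docname = c("doc1", "doc1", "doc1", "doc2", "doc2"),
  cluster = factor(c(1, 1, 3, 2, 3), levels = 1:3)
)

# Counts per document (rows) and cluster (columns)
counts <- table(docname = df$docname, cluster = df$cluster)

# Divide each row by its total so that the rates in a document sum to 1
rates <- prop.table(counts, margin = 1)
print(round(rates, 3))
```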

get_clusters_batch

Description

get_clusters_batch

Usage

get_clusters_batch(
  template,
  input_dir,
  output_dir,
  writer_indices = NULL,
  doc_indices = NULL,
  num_cores = 1,
  save_master_file = FALSE
)

Arguments

`template`	A cluster template created with `make_clustering_template`
`input_dir`	A directory containing graphs created with `process_batch_dir`
`output_dir`	Output directory for cluster assignments
`writer_indices`	Optional. A vector of start and end indices for the writer ID in the graph file names.
`doc_indices`	Optional. A vector of start and end indices for the document ID in the graph file names.
`num_cores`	Integer number of cores to use for parallel processing
`save_master_file`	TRUE or FALSE. If TRUE, a master file named 'all_clusters.rds' containing the cluster assignments for all documents in the input directory will be saved to the output directory. If FALSE, a master file will not be saved, but the individual files for each document in the input directory will still be saved to the output directory.

Value

A list of cluster assignments

Examples

## Not run: 
template <- readRDS('path/to/template.rds')
get_clusters_batch(template=template, input_dir='path/to/dir', output_dir='path/to/dir',
writer_indices=c(2,5), doc_indices=c(7,18), num_cores=1)

get_clusters_batch(template=template, input_dir='path/to/dir', output_dir='path/to/dir',
writer_indices=c(1,4), doc_indices=c(5,10), num_cores=5)

## End(Not run)

Get Credible Intervals

Description

In a model created with fit_model() the pi parameters are the estimate of the true cluster fill count for a particular writer and cluster. The function get_credible_intervals() calculates the credible intervals of the pi parameters for each writer in the model.

Usage

get_credible_intervals(model, interval_min = 0.05, interval_max = 0.95)

Arguments

`model`	A model output by `fit_model()`
`interval_min`	The lower bound for the credible interval. The number must be between 0 and 1.
`interval_max`	The upper bound for the credible interval. The number must be greater than `interval_min` and must be less than 1.

Value

A list of data frames. Each data frame lists the credible intervals for a single writer.

Examples

get_credible_intervals(model=example_model)
get_credible_intervals(model=example_model, interval_min=0.05, interval_max=0.95)
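
One common way to form a credible interval from MCMC draws is to take quantiles of the draws at interval_min and interval_max. The sketch below does this for a single simulated parameter. It assumes an equal-tailed quantile interval, which this manual does not confirm as the method used by get_credible_intervals(), and the draws are simulated rather than taken from a fitted model.

```r
set.seed(42)

# Hypothetical posterior draws of one pi parameter (values between 0 and 1)
pi_draws <- rbeta(1000, shape1 = 2, shape2 = 8)

interval_min <- 0.05
interval_max <- 0.95

# Assumption: the interval bounds are the empirical quantiles of the draws
interval <- quantile(pi_draws, probs = c(interval_min, interval_max))
print(interval)
```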

Get Posterior Probabilities

Description

Get the posterior probabilities for a questioned document analyzed with analyze_questioned_documents().

Usage

get_posterior_probabilities(analysis, questioned_doc)

Arguments

`analysis`	The output of `analyze_questioned_documents()`. If more than one questioned document was analyzed with this function, then the data frame `analysis$posterior_probabilities` lists the posterior probabilities for all questioned documents. `get_posterior_probabilities()` creates a data frame of the posterior probabilities for a single questioned document and sorts the known writers from the most likely to least likely to have written the questioned document.
`questioned_doc`	The filename of the questioned document

Value

A data frame of posterior probabilities for the questioned document

Examples

get_posterior_probabilities(
  analysis = example_analysis,
  questioned_doc = "w0030_s03_pWOZ_r01"
)
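
The sorting step described for the analysis argument can be sketched in base R: select the column for one questioned document and order the known writers from most to least likely. The data frame below is hypothetical; its column names and values are assumptions and do not come from example_analysis.

```r
# Hypothetical posterior probabilities for two questioned documents
pp <- data.frame(
  known_writer = c("known_writer_1", "known_writer_2", "known_writer_3"),
  doc_a = c(0.10, 0.70, 0.20),
  doc_b = c(0.85, 0.05, 0.10)
)

questioned_doc <- "doc_a"

# Keep the writer column and the chosen document, then sort by probability
single <- pp[, c("known_writer", questioned_doc)]
single <- single[order(single[[questioned_doc]], decreasing = TRUE), ]
print(single)
```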

Estimate Writer Profiles

Description

Estimate writer profiles from handwritten documents scanned and saved as PNG files. Each file in input_dir is split into component shapes called graphs with process_batch_dir. Then the graphs are sorted into clusters with similar shapes using the cluster template and get_clusters_batch. An estimate of the writer profile for a document is the proportion of graphs from that document assigned to each of the clusters in template. The writer profiles are estimated by running get_cluster_fill_counts. If measure is counts then the cluster fill counts are returned. If measure is rates then get_cluster_fill_rates is run and the cluster fill rates are returned.

Usage

get_writer_profiles(
  input_dir,
  measure = "counts",
  num_cores = 1,
  template = templateK40,
  writer_indices = NULL,
  doc_indices = NULL,
  output_dir = NULL
)

Arguments

`input_dir`	A filepath to a folder containing one or more handwritten documents, scanned and saved as PNG file(s).
`measure`	A character string: either `counts` or `rates`. `counts` returns the cluster fill counts, i.e., the number of graphs assigned to each cluster. `rates` returns the cluster fill rates, i.e., the proportion of graphs assigned to each cluster.
`num_cores`	An integer number of cores, greater than or equal to 1, to use for parallel processing.
`template`	Optional. A cluster template created with `make_clustering_template`. The default is `templateK40`.
`writer_indices`	A vector of start and stop characters for writer IDs in file names
`doc_indices`	A vector of start and stop characters for document names in file names
`output_dir`	Optional. A filepath to a folder to save the RDS files created by `process_batch_dir` and `get_clusters_batch`. If no folder is supplied, the RDS files will be saved to the temporary directory and then deleted before the function terminates.

Details

The functions process_batch_dir and get_clusters_batch take upwards of 30 seconds per document and the results are saved to RDS files in output_dir > graphs and output_dir > clusters, respectively. If output_dir is NULL then the results are saved to the temporary directory and deleted before the function terminates.

Value

A data frame

Examples


docs <- system.file(file.path("extdata"), package = "handwriter")
profiles <- get_writer_profiles(docs, measure = "counts")
plot_writer_profiles(profiles)

profiles <- get_writer_profiles(docs, measure = "rates")
plot_writer_profiles(profiles)


Convert graph to a prototype

Description

A graph prototype consists of the starting and ending points of each path in the graph, as well as evenly spaced points along each path. The prototype also stores the center point of the graph. All points are represented as xy-coordinates and the center point is at (0,0).

Usage

graphToPrototype(graph, numPathCuts = 8)

Arguments

`graph`	A graph from a handwriting sample
`numPathCuts`	Number of segments to cut the path(s) into

Value

List of pathEnds, pathQuarters, and pathCenters given as (x,y) coordinates with the graph centroid at (0,0). The returned list also contains path lengths. pathQuarters gives the (x,y) coordinates of the path at the cut points and despite the name, the path might not be cut into quarters.
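
The idea of evenly spaced points along a path, expressed relative to a center at (0,0), can be sketched in base R on a simple polyline. This is an illustration under stated assumptions and not the graphToPrototype() implementation: the path is a hypothetical three-point polyline, the points are found by linear interpolation along arc length, and the center is taken to be the mean of the sampled points, which may differ from the graph centroid the package uses.

```r
# Hypothetical path: 4 units to the right, then 3 units up (total length 7)
path <- cbind(x = c(0, 4, 4), y = c(0, 0, 3))

# Cumulative distance along the path at each vertex: 0, 4, 7
seg_len <- sqrt(rowSums(diff(path)^2))
cum_len <- c(0, cumsum(seg_len))

# Cut the path into 7 equal-length segments, which gives 8 points
num_cuts <- 7
targets <- seq(0, max(cum_len), length.out = num_cuts + 1)

# Interpolate x and y at the target distances
pts <- cbind(
  x = approx(cum_len, path[, "x"], xout = targets)$y,
  y = approx(cum_len, path[, "y"], xout = targets)$y
)

# Shift the points so that their mean sits at (0, 0)
centered <- sweep(pts, 2, colMeans(pts))
print(centered)
```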

Cursive written word: London

Description

Cursive written word: London

Usage

london

Format

Binary image matrix. 148 rows and 481 columns.

Examples

london_document <- list()
london_document$image <- london
plotImage(london_document)
london_document$thin <- thinImage(london_document$image)
plotImageThinned(london_document)
london_processList <- processHandwriting(london_document$thin, dim(london_document$image))

Make Clustering Template

Description

make_clustering_template() applies a K-means clustering algorithm to the input handwriting samples pre-processed with process_batch_dir() and saved in the input folder main_dir > data > template_graphs. The K-means algorithm sorts the graphs in the input handwriting samples into groups, or clusters, of similar graphs.

Usage

make_clustering_template(
  main_dir,
  template_docs,
  writer_indices,
  centers_seed,
  K = 40,
  num_dist_cores = 1,
  max_iters = 25
)

Arguments

`main_dir`	Main directory that will store template files
`template_docs`	A directory containing template training images
`writer_indices`	A vector of the starting and ending location of the writer ID in the file name.
`centers_seed`	Integer seed for the random number generator when selecting starting cluster centers.
`K`	Integer number of clusters
`num_dist_cores`	Integer number of cores to use for the distance calculations in the K-means algorithm. Each iteration of the K-means algorithm calculates the distance between each input graph and each cluster center.
`max_iters`	Maximum number of iterations to allow the K-means algorithm to run

Value

List containing the cluster template

Examples

## Not run: 
main_dir <- "path/to/folder"
template_docs <- "path/to/template_training_docs"
template_list <- make_clustering_template(
  main_dir = main_dir,
  template_docs = template_docs,
  writer_indices = c(2, 5),
  K = 10,
  num_dist_cores = 2,
  max_iters = 25,
  centers_seed = 100
)

## End(Not run)

Full page image of the handwritten London letter.

Description

Full page image of the handwritten London letter.

Usage

message

Format

Binary image matrix. 1262 rows and 1162 columns.

Examples

message_document <- list()
message_document$image <- message
plotImage(message_document)

## Not run: 
message_document <- list()
message_document$image <- message
plotImage(message_document)
message_document$thin <- thinImage(message_document$image)
plotImageThinned(message_document)
message_processList <- processHandwriting(message_document$thin, dim(message_document$image))

## End(Not run)

Full page image of the 4th sample (nature) of handwriting from the first writer.

Description

Full page image of the 4th sample (nature) of handwriting from the first writer.

Usage

nature1

Format

Binary image matrix. 811 rows and 1590 columns.

Examples

nature1_document <- list()
nature1_document$image <- nature1
plotImage(nature1_document)

## Not run: 
nature1_document <- list()
nature1_document$image <- nature1
plotImage(nature1_document)
nature1_document$thin <- thinImage(nature1_document$image)
plotImageThinned(nature1_document)
nature1_processList <- processHandwriting(nature1_document$thin, dim(nature1_document$image))

## End(Not run)

Plot Template Cluster Centers

Description

Plot the cluster centers of a cluster template created with make_clustering_template. That function uses a K-means type algorithm to sort graphs from training documents into clusters. On each iteration of the algorithm, it calculates the mean graph of each cluster and finds the graph in each cluster that is closest to the mean graph. The graphs closest to the mean graphs are used as the cluster centers for the next iteration. Handwriter stores the cluster centers of a cluster template as graph prototypes. A graph prototype consists of the starting and ending points of each path in the graph, as well as evenly spaced points along each path. The prototype also stores the center point of the graph. All points are represented as xy-coordinates and the center point is at (0,0).

Usage

plot_cluster_centers(template, plot_graphs = FALSE, size = 100)

Arguments

`template`	A cluster template created with `make_clustering_template`
`plot_graphs`	TRUE plots all graphs in each cluster in addition to the cluster centers. FALSE only plots the cluster centers.
`size`	The size of the output plot

Value

A plot

Examples

# plot cluster centers from example template
plot_cluster_centers(example_cluster_template)
plot_cluster_centers(example_cluster_template, plot_graphs = TRUE)
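
The K-means type update in the Description (compute each cluster's mean, then use the member closest to that mean as the next center) can be sketched in base R. This is only an illustration under stated assumptions: each "graph" is replaced by a 2D point and Euclidean distance stands in for the package's graph distance, so it is not the code used by make_clustering_template.

```r
# Toy stand-in: each "graph" is a 2D point (assumption for illustration)
set.seed(1)
graphs <- rbind(
  matrix(rnorm(20, mean = 0, sd = 0.3), ncol = 2),
  matrix(rnorm(20, mean = 5, sd = 0.3), ncol = 2)
)

# Start with one graph from each group as the initial centers
centers <- graphs[c(1, 11), , drop = FALSE]

for (iter in 1:10) {
  # Distance from every graph to every center (rows = graphs, cols = centers)
  d <- sapply(seq_len(nrow(centers)), function(k) {
    sqrt(rowSums(sweep(graphs, 2, centers[k, ])^2))
  })
  # Assign each graph to its nearest center
  cluster <- max.col(-d, ties.method = "first")

  # For each cluster, compute the mean and pick the member closest to it
  new_centers <- centers
  for (k in seq_len(nrow(centers))) {
    members <- graphs[cluster == k, , drop = FALSE]
    mean_k <- colMeans(members)
    dist_to_mean <- sqrt(rowSums(sweep(members, 2, mean_k)^2))
    new_centers[k, ] <- members[which.min(dist_to_mean), ]
  }

  # Stop when the centers no longer change
  if (all(new_centers == centers)) break
  centers <- new_centers
}
```

Because every center is an actual member of its cluster rather than an average, the centers can be stored and plotted as real graphs, which is what the graph prototypes described above represent.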

Plot Cluster Fill Counts

Description

Plot the cluster fill counts for each document in formatted_data.

Usage

plot_cluster_fill_counts(formatted_data, facet = TRUE)

Arguments

`formatted_data`	Data created by `format_template_data()`, `fit_model()`, or `analyze_questioned_documents()`
`facet`	`TRUE` uses `facet_wrap` to create a subplot for each writer. `FALSE` plots the data on a single plot.

Value

ggplot plot of cluster fill counts

Examples

# Plot cluster fill counts for template training documents
template_data <- format_template_data(example_cluster_template)
plot_cluster_fill_counts(formatted_data = template_data, facet = TRUE)

# Plot cluster fill counts for model training documents
plot_cluster_fill_counts(formatted_data = example_model, facet = TRUE)

# Plot cluster fill counts for questioned documents
plot_cluster_fill_counts(formatted_data = example_analysis, facet = FALSE)

Plot Cluster Fill Rates

Description

Plot the cluster fill rates for each document in formatted_data.

Usage

plot_cluster_fill_rates(formatted_data, facet = FALSE)

Arguments

`formatted_data`	Data created by `format_template_data()`, `fit_model()`, or `analyze_questioned_documents()`
`facet`	`TRUE` uses `facet_wrap` to create a subplot for each writer. `FALSE` plots the data on a single plot.

Value

ggplot plot of cluster fill rates

Examples

# Plot cluster fill rates for template training documents
template_data <- format_template_data(example_cluster_template)
plot_cluster_fill_rates(formatted_data = template_data, facet = TRUE)

# Plot cluster fill rates for model training documents
plot_cluster_fill_rates(formatted_data = example_model, facet = TRUE)

# Plot cluster fill rates for questioned documents
plot_cluster_fill_rates(formatted_data = example_analysis, facet = FALSE)
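
For readers unsure how fill rates relate to fill counts, the following base R sketch may help. It assumes a fill rate is a document's count in a cluster divided by that document's total number of graphs; the count values and names below are made up for illustration and do not come from the package data.

```r
# Hypothetical fill counts: rows are documents, columns are clusters
counts <- matrix(
  c(12, 3, 5,
     4, 8, 8),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("doc1", "doc2"), paste0("cluster", 1:3))
)

# Rate = share of a document's graphs that fall in each cluster.
# Dividing a matrix by a vector of row sums recycles down the columns,
# so each entry is divided by the total of its own row.
rates <- counts / rowSums(counts)
rates
```

Rates make documents of different lengths comparable, since each row sums to 1 regardless of how many graphs the document contains.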

Plot Credible Intervals

Description

Plot credible intervals for the model's pi parameters that estimate the true writer cluster fill counts.

Usage

plot_credible_intervals(
  model,
  interval_min = 0.025,
  interval_max = 0.975,
  facet = FALSE
)

Arguments

`model`	A model created by `fit_model()`
`interval_min`	The lower bound of the credible interval. It must be greater than zero and less than 1.
`interval_max`	The upper bound of the credible interval. It must be greater than the interval minimum and less than 1.
`facet`	`TRUE` uses `facet_wrap` to create a subplot for each writer. `FALSE` plots the data on a single plot.

Value

ggplot plot of credible intervals

Examples

plot_credible_intervals(model = example_model)
plot_credible_intervals(model = example_model, facet = TRUE)
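
The role of interval_min and interval_max can be illustrated with simulated draws. This sketch assumes the interval is equal-tailed and computed from quantiles of the posterior draws; the draws here are simulated from a beta distribution and are not taken from example_model.

```r
set.seed(42)

# Simulated draws standing in for the MCMC draws of one pi parameter
draws <- rbeta(4000, shape1 = 20, shape2 = 80)

interval_min <- 0.025
interval_max <- 0.975

# Lower bound, median, and upper bound of the credible interval
ci <- quantile(draws, probs = c(interval_min, 0.5, interval_max))
ci
```

With the defaults 0.025 and 0.975, the interval covers the central 95% of the draws.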

Plot Graphs

Description

Use processDocument() to split handwriting into component shapes called graphs. plot_graphs() creates a plot that displays the graphs. ggplot2::facet_wrap() places each graph in its own facet, and ncol sets the number of columns of facets.

Usage

plot_graphs(doc, ncol = NULL)

Arguments

`doc`	A PNG image of handwriting processed with `processDocument()`.
`ncol`	Optionally, set the number of columns in the output plot. The default is `NULL` which allows `ggplot2::facet_wrap()` to automatically choose the number of columns.

Value

A plot of all graphs in the document

Examples

image_path <- system.file("extdata", "phrase_example.png", package = "handwriter")
doc <- processDocument(image_path)
plot_graphs(doc)

Plot Posterior Probabilities

Description

Creates a tile plot of posterior probabilities of writership for each questioned document and each known writer analyzed with analyze_questioned_documents().

Usage

plot_posterior_probabilities(analysis)

Arguments

`analysis`	A named list of analysis results from `analyze_questioned_documents()`.

Value

A tile plot of posterior probabilities of writership.

Examples

plot_posterior_probabilities(analysis = example_analysis)

Plot Trace

Description

Create a trace plot for all chains for a single variable of a fitted model created by fit_model(). If the model contains more than one chain, the chains will be combined by pasting them together.

Usage

plot_trace(variable, model)

Arguments

`variable`	The name of a variable in the model
`model`	A model created by `fit_model()`

Value

A trace plot

Examples

plot_trace(model = example_model, variable = "pi[1,1]")
plot_trace(model = example_model, variable = "mu[2,3]")
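
To show what "pasting chains together" means for a trace plot, here is a base R sketch with two simulated chains. The chain values are made up, and the base graphics call is only a stand-in for the plot that plot_trace() produces.

```r
set.seed(7)

# Two simulated chains for a single variable (hypothetical values)
chain1 <- rnorm(500, mean = 0.2, sd = 0.05)
chain2 <- rnorm(500, mean = 0.2, sd = 0.05)

# Paste the chains end to end: chain2 follows chain1 on the iteration axis
combined <- c(chain1, chain2)
trace_df <- data.frame(iteration = seq_along(combined), value = combined)

plot(trace_df$iteration, trace_df$value, type = "l",
     xlab = "Iteration", ylab = "pi[1,1]")
```

In a combined trace like this, a visible jump at the point where one chain ends and the next begins can suggest that the chains have not converged to the same distribution.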

Plot Writer Profiles

Description

Create a line plot of writer profiles for one or more documents.

Usage

plot_writer_profiles(profiles, color_by = "docname", ...)

Arguments

`profiles`	A data frame of writer profiles created with `get_writer_profiles()`.
`color_by`	A column name. 'ggplot2' will always group by docname, but will use this column to assign colors.
`...`	Additional arguments passed to `ggplot2::facet_wrap`, such as `facets`, `nrow`, etc.

Value

A line plot

Examples


docs <- system.file(file.path("extdata"), package = "handwriter")
profiles <- get_writer_profiles(docs, measure = "counts")
plot_writer_profiles(profiles)

profiles <- get_writer_profiles(docs, measure = "rates")
plot_writer_profiles(profiles)

Plot Image

Description

This function plots a basic black and white image.

Usage

plotImage(doc)

Arguments

`doc`	A document processed with `processDocument()` or a binary matrix (all entries are 0 or 1)

Value

ggplot plot

Examples

csafe_document <- list()
csafe_document$image <- csafe
plotImage(csafe_document)

## Not run: 
document <- processDocument('path/to/image.png')
plotImage(document)

## End(Not run)

Plot Thinned Image

Description

This function returns a plot with the full image plotted in light gray and the thinned skeleton printed in black on top.

Usage

plotImageThinned(doc)

Arguments

`doc`	A document processed with `processHandwriting()`

Value

ggplot plot of thinned image

Examples

csafe_document <- list()
csafe_document$image <- csafe
csafe_document$thin <- thinImage(csafe_document$image)
plotImageThinned(csafe_document)

Plot Letter

Description

This function returns a plot of a single graph extracted from a document. It uses the letterList from a document processed with processHandwriting() or processDocument() and accepts a single value as whichLetter.

Usage

plotLetter(
  doc,
  whichLetter,
  showPaths = TRUE,
  showCentroid = TRUE,
  showSlope = TRUE,
  showNodes = TRUE
)

Arguments

`doc`	A document processed with `processHandwriting()` or `processDocument()`
`whichLetter`	Single value in 1:length(letterList) denoting which letter to plot.
`showPaths`	Whether the calculated paths on the letter should be shown with numbers.
`showCentroid`	Whether the centroid should be shown
`showSlope`	Whether the slope should be shown
`showNodes`	Whether the nodes should be shown

Value

Plot of single letter.

Examples

twoSent_document = list()
twoSent_document$image = twoSent
twoSent_document$thin = thinImage(twoSent_document$image)
twoSent_document$process = processHandwriting(twoSent_document$thin, dim(twoSent_document$image))
plotLetter(twoSent_document, 1)
plotLetter(twoSent_document, 4, showPaths = FALSE)

Plot Line

Description

This function returns a plot of a single line extracted from a document. It uses the letterList element of the output of processHandwriting() and accepts a single value as whichLine. dims must be the dimensions of the entire document, since these are not contained in the output of processHandwriting().

Usage

plotLine(letterList, whichLine, dims)

Arguments

`letterList`	Letter list from processHandwriting function
`whichLine`	Single value denoting which line to plot - checked if too big inside function.
`dims`	Dimensions of the original document

Value

ggplot plot of single line

Examples

twoSent_document = list()
twoSent_document$image = twoSent
twoSent_document$thin = thinImage(twoSent_document$image)
twoSent_processList = processHandwriting(twoSent_document$thin, dim(twoSent_document$image))

dims = dim(twoSent_document$image)
plotLine(twoSent_processList$letterList, 1, dims)

Plot Nodes

Description

This function returns a plot with the full image plotted in light gray and the skeleton printed in black, with red triangles over the vertices. Also called from plotPath, which is a more useful function, in general.

Usage

plotNodes(doc, plot_break_pts = FALSE, nodeSize = 3, nodeColor = "red")

Arguments

`doc`	A document processed with `processHandwriting()`
`plot_break_pts`	Logical value as to whether to plot nodes or break points. plot_break_pts=FALSE plots nodes and plot_break_pts=TRUE plots break points.
`nodeSize`	Size of triangles printed. 3 by default. Move down to 2 or 1 for small text images.
`nodeColor`	Which color the nodes should be

Value

Plot of full and thinned image with vertices overlaid.

Examples

csafe_document <- list()
csafe_document$image <- csafe
csafe_document$thin <- thinImage(csafe_document$image)
csafe_document$process <- processHandwriting(csafe_document$thin, dim(csafe_document$image))
plotNodes(csafe_document)
plotNodes(csafe_document, nodeSize=6, nodeColor="black")

Process Batch Directory

Description

Process a list of handwriting samples saved as PNG images in a directory: (1) Load the image and convert it to black and white with readPNGBinary() (2) Thin the handwriting to one pixel in width with thinImage() (3) Run processHandwriting() to split the handwriting into parts called edges and place nodes at the ends of edges. Then combine edges into component shapes called graphs. (4) Save the processed document in an RDS file. (5) Optional. Return a list of the processed documents.

Usage

process_batch_dir(input_dir, output_dir = ".", skip_docs_on_retry = TRUE)

Arguments

`input_dir`	Input directory that contains images
`output_dir`	A directory to save the processed images
`skip_docs_on_retry`	Logical. Whether to skip documents in input_dir that caused errors on a previous run. The errors and document names are stored in output_dir > problems.txt. If this is the first run, `process_batch_dir` will attempt to process all documents in input_dir.

Value

No return value, called for side effects

Examples

## Not run: 
process_batch_dir("path/to/input_dir", "path/to/output_dir")

## End(Not run)
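
The skip-on-retry behavior described for skip_docs_on_retry can be sketched as follows. Everything here is hypothetical and simplified: process_with_retry_log() and fake_process() are illustrative helpers that are not part of handwriter, the output file naming is invented, and the log stores only document names, whereas the package's problems.txt also records the errors.

```r
# Hypothetical helper (not part of handwriter): process each file, log failures,
# and skip files already listed in problems.txt on a later run.
process_with_retry_log <- function(files, output_dir, process_fun,
                                   skip_docs_on_retry = TRUE) {
  problems_file <- file.path(output_dir, "problems.txt")
  skipped <- character(0)
  if (skip_docs_on_retry && file.exists(problems_file)) {
    skipped <- readLines(problems_file)
  }
  for (f in files) {
    # Skip documents that failed on a previous run
    if (basename(f) %in% skipped) next
    out_file <- file.path(
      output_dir,
      paste0(tools::file_path_sans_ext(basename(f)), ".rds")
    )
    # Capture the error instead of stopping the whole batch
    result <- tryCatch(process_fun(f), error = function(e) e)
    if (inherits(result, "error")) {
      cat(basename(f), "\n", file = problems_file, append = TRUE, sep = "")
    } else {
      saveRDS(result, out_file)
    }
  }
  invisible(NULL)
}

# Usage with a stand-in processing function that fails on one document
out <- file.path(tempdir(), "batch_demo")
unlink(out, recursive = TRUE)
dir.create(out)
fake_process <- function(path) {
  if (grepl("bad", path)) stop("cannot process")
  list(docname = basename(path))
}
process_with_retry_log(c("a.png", "bad.png"), out, fake_process)
```

Logging failures to a file lets a long batch run finish even when a few documents cannot be processed, and a second run does not spend time on documents already known to fail.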

Process Batch List

Description

Process a list of handwriting samples saved as PNG images: (1) Load the image and convert it to black and white with readPNGBinary() (2) Thin the handwriting to one pixel in width with thinImage() (3) Run processHandwriting() to split the handwriting into parts called edges and place nodes at the ends of edges. Then combine edges into component shapes called graphs. (4) Save the processed document in an RDS file. (5) Optional. Return a list of the processed documents.

Usage

process_batch_list(images, output_dir, skip_docs_on_retry = TRUE)

Arguments

`images`	A vector of image file paths
`output_dir`	A directory to save the processed images
`skip_docs_on_retry`	Logical. Whether to skip documents in the images argument that caused errors on a previous run. The errors and document names are stored in output_dir > problems.txt. If this is the first run, `process_batch_list` will attempt to process all documents in the images argument.

Value

No return value, called for side effects

Examples

## Not run: 
images <- c('path/to/image1.png', 'path/to/image2.png', 'path/to/image3.png')
process_batch_list(images, "path/to/output_dir", FALSE)
process_batch_list(images, "path/to/output_dir", TRUE)

## End(Not run)

Process Document

Description

Load a handwriting sample from a PNG image. Then binarize, thin, and split the handwriting into graphs.

Usage

processDocument(path)

Arguments

`path`	File path for handwriting document. The document must be in PNG file format.

Value

The processed document as a list

Examples

image_path <- system.file("extdata", "phrase_example.png", package = "handwriter")
doc <- processDocument(image_path)
plotImage(doc)
plotImageThinned(doc)
plotNodes(doc)

Process Handwriting by Component

Description

The main driver of handwriting processing. Takes in an image of thinned handwriting created with thinImage() and splits the handwriting into shapes called graphs. Instead of processing the entire document at once, the thinned writing is separated into connected components and each component is split into graphs.

Usage

processHandwriting(img, dims)

Arguments

`img`	Thinned binary image created with `thinImage()`.
`dims`	Dimensions of thinned binary image.

Value

A list of the processed image

Examples

twoSent_document <- list()
twoSent_document$image <- twoSent
twoSent_document$thin <- thinImage(twoSent_document$image)
twoSent_processList <- processHandwriting(twoSent_document$thin, dim(twoSent_document$image))

Read and Process

Description

Development on read_and_process() is complete. We recommend using processDocument(). read_and_process(image_name, "document") is equivalent to processDocument(image_name).

Usage

read_and_process(image_name, transform_output)

Arguments

`image_name`	The file path to an image
`transform_output`	The type of transformation to perform on the output

Value

A list of the processed image components

Examples

# use handwriting example from handwriter package
image_path <- system.file("extdata", "phrase_example.png", package = "handwriter")
doc <- read_and_process(image_path, "document")

Read PNG Binary

Description

This function reads in and binarizes a PNG image.

Usage

readPNGBinary(
  path,
  cutoffAdjust = 0,
  clean = TRUE,
  crop = TRUE,
  inversion = FALSE
)

Arguments

`path`	File path for image.
`cutoffAdjust`	Multiplicative adjustment to the K-means estimated binarization cutoff.
`clean`	Whether to fill in white pixels with 7 or 8 neighbors. This helps considerably when thinning because it prevents small white bubbles in the text.
`crop`	Logical value dictating whether or not to crop the white out around the image. TRUE by default.
`inversion`	Logical value dictating whether or not to flip each pixel of binarized image. Flipping happens after binarization. FALSE by default.

Value

Returns image from path. 0 represents black, and 1 represents white by default.

Examples

image_path <- system.file("extdata", "phrase_example.png", package = "handwriter")
csafe_document <- list()
csafe_document$image <- readPNGBinary(image_path)
plotImage(csafe_document)
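
The binarization and inversion steps can be illustrated on a toy matrix. This sketch uses a fixed cutoff of 0.5 purely for illustration; readPNGBinary() estimates its cutoff from the image, and that estimation is not reproduced here.

```r
# Toy grayscale image with values in [0, 1] (made-up data)
gray <- matrix(c(0.05, 0.10, 0.90,
                 0.95, 0.20, 0.85), nrow = 2, byrow = TRUE)

# Fixed cutoff for illustration only (assumption)
cutoff <- 0.5

# Binarize: 1 represents white, 0 represents black
binary <- ifelse(gray > cutoff, 1, 0)

# Inversion flips every pixel after binarization
inverted <- 1 - binary

binary
inverted
```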

rgb2grayscale

Description

Changes RGB image to grayscale

Usage

rgb2grayscale(img)

Arguments

`img`	A 3D array with slices R, G, and B

Value

img as a 3D array as grayscale

rgba2rgb

Description

Removes alpha channel from png image.

Usage

rgba2rgb(img)

Arguments

`img`	A 3D array with slices R, G, B, and alpha.

Value

img as a 3D array with alpha channel removed

Cluster Template with 40 Clusters

Description

A cluster template with 40 clusters created with make_clustering_template. This template was created from handwriting samples from the CSAFE Handwriting Database, the CVL Handwriting Database, and the IAM Handwriting Database.

Usage

templateK40

Format

A list containing the contents of the cluster template.

cluster: A vector of cluster assignments for each graph used to create the cluster template. The clusters are numbered sequentially 1, 2,...,K.
centers: The final cluster centers produced by the K-Means algorithm.
K: The number of clusters in the template.
n: The number of training graphs used to create the template.
iters: The maximum number of iterations for the K-means algorithm.
WithinClustDists: The within-cluster distances on the final iteration of the K-means algorithm, that is, the distance between each graph and the center of the cluster to which it was assigned on that iteration. The output of make_clustering_template stores the within-cluster distances on each iteration, but the previous iterations were removed here to reduce the file size.

Details

'handwriter' splits handwriting samples into component shapes called graphs. The graphs are sorted into 40 clusters with a K-Means algorithm.

Examples

# view number of clusters
templateK40$K

# view number of iterations
templateK40$iters

# view cluster centers
plot_cluster_centers(templateK40)

thinImage

Description

This function returns a vector of locations for black pixels in the thinned image. Thinning is done using the Zhang-Suen algorithm.

Usage

thinImage(img)

Arguments

`img`	A binary matrix of the text that is to be thinned.

Value

A thinned, one pixel wide, image.

Two sentence printed example handwriting

Description

Two sentence printed example handwriting

Usage

twoSent

Format

Binary image matrix. 396 rows and 1947 columns

Examples

twoSent_document <- list()
twoSent_document$image <- twoSent
plotImage(twoSent_document)

## Not run: 
twoSent_document <- list()
twoSent_document$image <- twoSent
plotImage(twoSent_document)
twoSent_document$thin <- thinImage(twoSent_document$image)
plotImageThinned(twoSent_document)
twoSent_processList <- processHandwriting(twoSent_document$thin, dim(twoSent_document$image))

## End(Not run)

whichToFill

Description

Finds pixels in the image that shouldn't be white and makes them black. This is a quick cleaning step to run before the thinning algorithm.

Usage

whichToFill(img)

Arguments

`img`	A binary matrix.

Value

A cleaned up image.
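
A cleaning rule of this kind can be sketched in base R. The sketch assumes that "7 or 8 neighbors" (as worded for the clean argument of readPNGBinary()) means black pixels among the 8 surrounding pixels, and it ignores border pixels for simplicity; fill_enclosed_white() is a hypothetical name, not the package's implementation.

```r
# Hypothetical helper: turn a white pixel black when at least
# `min_black_neighbors` of its 8 neighbors are black.
# Convention: 0 = black, 1 = white. Border pixels are left unchanged.
fill_enclosed_white <- function(img, min_black_neighbors = 7) {
  out <- img
  for (i in 2:(nrow(img) - 1)) {
    for (j in 2:(ncol(img) - 1)) {
      if (img[i, j] == 1) {
        block <- img[(i - 1):(i + 1), (j - 1):(j + 1)]
        # The center pixel is white, so only neighbors are counted
        black_neighbors <- sum(block == 0)
        if (black_neighbors >= min_black_neighbors) out[i, j] <- 0
      }
    }
  }
  out
}

# Toy image: black block with a one-pixel white hole and a white top row
img <- matrix(0, nrow = 5, ncol = 5)
img[3, 3] <- 1
img[1, ] <- 1
cleaned <- fill_enclosed_white(img)
cleaned
```

Filling such isolated white pixels before thinning avoids small loops in the one-pixel-wide skeleton that do not correspond to any real stroke.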
