
Benefault

public
2 stars
1 forks
0 issues

Commits

List of commits on branch master.
Verified
901d915007184db64277f414df520150313a51d7

Update README.md

GGuanhuaWang committed 7 years ago
Verified
fc0e327df5a5f2e4203c75ab749ab805c8b79e2b

Update JCudaDFVectorAdd.scala

GGuanhuaWang committed 7 years ago
Verified
e758578beeef8fa098f47211b976a160512688f2

Update PhaseCount.scala

GGuanhuaWang committed 7 years ago
Verified
b5fa8e4c36c8378bd05b9ebb0eb47a553a9d732e

Update JCudaDFVectorAdd.scala

GGuanhuaWang committed 7 years ago
Unverified
2d907d5535b80affc273de5a5adb6300791e3f21

Update wordapp.scala

GGuanhuaWang committed 7 years ago
Unverified
56a3ac89b5adc83ec7d20cbab966ecc9deee49af

Update sort.sbt

GGuanhuaWang committed 8 years ago

README

Benefault

License

A task preemption mechanism for big data analytics platforms

What we have done

  • A simple shell script for monitoring each node's metadata (e.g. disk access, network Tx/Rx) in a cluster
  • Read and write for checkpointing data (note: checkpointRead is private in Spark, so the function has to be packaged into org.apache.spark)
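
The packaging note above can be illustrated with a minimal sketch. It assumes the restricted method in question is `SparkContext.checkpointFile`, which is declared `protected[spark]` in the Spark versions we are aware of; the README does not say which method it means. The helper name `CheckpointIO` is hypothetical and not taken from this repository. The code needs spark-core on the classpath.

```scala
// Declared inside org.apache.spark so that protected[spark] members are visible.
package org.apache.spark

import scala.reflect.ClassTag
import org.apache.spark.rdd.RDD

// Hypothetical helper object; the name is an assumption for illustration.
object CheckpointIO {

  /** Write: mark the RDD for checkpointing, force it with an action,
    * and return the directory Spark wrote the checkpoint to.
    * sc.setCheckpointDir(...) must have been called beforehand. */
  def write[T](rdd: RDD[T]): String = {
    rdd.checkpoint() // only marks the RDD; nothing is written yet
    rdd.count()      // running a job triggers the actual checkpoint
    rdd.getCheckpointFile.getOrElse(
      throw new IllegalStateException("RDD was not checkpointed"))
  }

  /** Read: rebuild an RDD from a checkpoint directory written earlier. */
  def read[T: ClassTag](sc: SparkContext, path: String): RDD[T] =
    sc.checkpointFile[T](path) // protected[spark], hence the package above
}
```

A caller would first set a checkpoint directory with `sc.setCheckpointDir`, pass an RDD to `CheckpointIO.write`, and later hand the returned path to `CheckpointIO.read` to restore the data. Placing user code in Spark's package relies on Spark internals and may break between versions.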

We have already simulated the job completion time (JCT) gain achievable with Benefault

The performance gain is 15-30%

We test latency in various scenarios

  • Measure checkpoint latency using Spark
  • Word Count with checkpointing
  • Sorting with checkpointing
  • GroupByKey with checkpointing
  • DecisionTree with periodic checkpointing
  • We are now designing schemes to evaluate the best gain achievable with Benefault
  • Finding the sweet spot for deciding whether to kill or preempt a task
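
One way to reason about the kill-or-preempt sweet spot is a simple cost comparison. The sketch below is an assumption for illustration, not the scheme used in this repository: it supposes that killing a task wastes all the work done so far, while preempting it costs one checkpoint write plus one checkpoint read on resume. All names (`TaskState`, `PreemptionPolicy`) and numbers are hypothetical; the latencies would come from measurements like those listed above.

```scala
// Hypothetical model: every name and field here is an illustrative assumption.
final case class TaskState(
  elapsedSec: Double,         // work already done, lost if the task is killed
  checkpointWriteSec: Double, // measured latency to write the task's checkpoint
  checkpointReadSec: Double   // measured latency to restore from the checkpoint
)

sealed trait Decision
case object Kill extends Decision
case object Preempt extends Decision

object PreemptionPolicy {
  /** Time wasted if the task is killed: everything done so far is redone. */
  def killCost(t: TaskState): Double = t.elapsedSec

  /** Time wasted if the task is checkpointed now and resumed later. */
  def preemptCost(t: TaskState): Double =
    t.checkpointWriteSec + t.checkpointReadSec

  /** Pick whichever option wastes less time; ties go to Kill,
    * since killing needs no checkpoint storage. */
  def decide(t: TaskState): Decision =
    if (preemptCost(t) < killCost(t)) Preempt else Kill
}
```

Under this model the break-even point is where the elapsed time equals the combined checkpoint write and read latency: shorter-running tasks are cheaper to kill, longer-running ones are cheaper to preempt. A fuller model would also account for queueing delay and storage pressure.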